date:20221102

Re: [PATCH 3/3] vdpa: Expose VIRTIO_NET_F_STATUS unconditionally

2022-11-02 Thread Jason Wang

On Wed, Nov 2, 2022 at 7:19 PM Eugenio Perez Martin  wrote:
>
> On Tue, Nov 1, 2022 at 9:10 AM Jason Wang  wrote:
> >
> > On Fri, Oct 28, 2022 at 5:30 PM Eugenio Perez Martin
> >  wrote:
> > >
> > > On Fri, Oct 28, 2022 at 3:59 AM Jason Wang  wrote:
> > > >
> > > > On Thu, Oct 27, 2022 at 6:18 PM Eugenio Perez Martin
> > > >  wrote:
> > > > >
> > > > > On Thu, Oct 27, 2022 at 8:54 AM Jason Wang  
> > > > > wrote:
> > > > > >
> > > > > > On Thu, Oct 27, 2022 at 2:47 PM Eugenio Perez Martin
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Thu, Oct 27, 2022 at 6:32 AM Jason Wang  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > > On 2022/10/26 17:53, Eugenio Pérez wrote:
> > > > > > > > > > Now that qemu can handle and emulate it if the vdpa backend does not
> > > > > > > > > > support it we can offer it always.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Eugenio Pérez 
> > > > > > > >
> > > > > > > >
> > > > > > > > > I may be missing something, but isn't it easier to simply remove
> > > > > > > > > _F_STATUS from vdpa_feature_bits[]?
> > > > > > > >
> > > > > > >
> > > > > > > How is that? if we remove it, the guest cannot ack it so it cannot
> > > > > > > access the net status, isn't it?
> > > > > >
> > > > > > My understanding is that the bits stored in the vdpa_feature_bits[]
> > > > > > are the features that must be explicitly supported by the vhost
> > > > > > device.
> > > > >
> > > > > (Non English native here, so maybe I don't get what you mean :) ) The
> > > > > device may not support them. net simulator lacks some of them
> > > > > actually, and it works.
> > > >
> > > > Speaking too fast, I think I meant that, if the bit doesn't belong to
> > > > vdpa_feature_bits[], it is assumed to be supported by the Qemu without
> > > > the support of the vhost. So Qemu won't even try to validate if vhost
> > > > has this support. E.g for vhost-net, we only have:
> > > >
> > > > static const int kernel_feature_bits[] = {
> > > > VIRTIO_F_NOTIFY_ON_EMPTY,
> > > > VIRTIO_RING_F_INDIRECT_DESC,
> > > > VIRTIO_RING_F_EVENT_IDX,
> > > > VIRTIO_NET_F_MRG_RXBUF,
> > > > VIRTIO_F_VERSION_1,
> > > > VIRTIO_NET_F_MTU,
> > > > VIRTIO_F_IOMMU_PLATFORM,
> > > > VIRTIO_F_RING_PACKED,
> > > > VIRTIO_NET_F_HASH_REPORT,
> > > > VHOST_INVALID_FEATURE_BIT
> > > > };
> > > >
> > > > You can see there's no STATUS bit there since it is emulated by Qemu.
> > > >
> > >
> > > Ok, now I get what you mean, and yes, we may modify the patches in that
> > > direction.
> > >
> > > But if we go that way, we need to modify how qemu acks the features, because
> > > the features that are not in vdpa_feature_bits are not acked to the
> > > device. More on this later.
> > >
> > > > >
> > > > > From what I see these are the only features that will be forwarded to
> > > > > the guest as device_features. If it is not in the list, the feature
> > > > > will be masked out,
> > > >
> > > > Only when there's no support for this feature from the vhost.
> > > >
> > > > > as if the device does not support it.
> > > > >
> > > > > So now _F_STATUS it was forwarded only if the device supports it. If
> > > > > we remove it from bit_mask, it will never be offered to the guest. But
> > > > > we want to offer it always, since we will need it for
> > > > > _F_GUEST_ANNOUNCE.
> > > > >
> > > > > Things get more complex because we actually need to ack it back if the
> > > > > device offers it, so the vdpa device can report link_down. We will
> > > > > only emulate LINK_UP always in the case the device does not support
> > > > > _F_STATUS.
> > > > >
> > > > > > So if we remove _F_STATUS, Qemu vhost code won't validate if
> > > > > > vhost-vdpa device has this support:
> > > > > >
> > > > > > > uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
> > > > > > >                             uint64_t features)
> > > > > > > {
> > > > > > >     const int *bit = feature_bits;
> > > > > > >     while (*bit != VHOST_INVALID_FEATURE_BIT) {
> > > > > > >         uint64_t bit_mask = (1ULL << *bit);
> > > > > > >         if (!(hdev->features & bit_mask)) {
> > > > > > >             features &= ~bit_mask;
> > > > > > >         }
> > > > > > >         bit++;
> > > > > > >     }
> > > > > > >     return features;
> > > > > > > }
> > > > > >
> > > > >
> > > > > Now maybe I'm the one missing something, but why is this not done as a
> > > > > masking directly?
> > > >
> > > > Not sure, the code has been there since day 0.
> > > >
> > > > But you can see from the code:
> > > >
> > > > 1) if STATUS is in feature_bits, we need to validate hdev->features
> > > > and mask it if the vhost doesn't have the support
> > > > 2) if STATUS is not, we don't do the check and the driver may still see
> > > > STATUS
> > > >
> > >
> > > That's useful for _F_GUEST_ANNOUNCE, but we need to ack _F_STATUS for
> > > the device if it supports it.
> >
> > Rethink about this, I don't see why ANNOUNCE depends on STATUS (spec
> >
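
For readers following the thread, the two cases Jason describes can be reproduced with a small standalone sketch. It mirrors the loop in the quoted vhost_get_features() but takes the backend feature word directly instead of a struct vhost_dev. The sketch_* names and the end-of-list marker are made up for this example; only the bit numbers for VIRTIO_NET_F_MRG_RXBUF (15) and VIRTIO_NET_F_STATUS (16) are taken from the virtio spec.

```c
#include <stdint.h>

/* Bit numbers as defined by the virtio-net spec. */
#define SKETCH_NET_F_MRG_RXBUF 15
#define SKETCH_NET_F_STATUS    16
/* Hypothetical end-of-list marker, standing in for VHOST_INVALID_FEATURE_BIT. */
#define SKETCH_INVALID_BIT     0xff

/*
 * Same loop shape as the quoted vhost_get_features(): every bit named in
 * feature_bits[] is cleared from `features` unless the backend offers it.
 * Bits that are not named in the list are left untouched.
 */
uint64_t sketch_get_features(uint64_t backend_features,
                             const int *feature_bits,
                             uint64_t features)
{
    for (const int *bit = feature_bits; *bit != SKETCH_INVALID_BIT; bit++) {
        uint64_t bit_mask = 1ULL << *bit;

        if (!(backend_features & bit_mask)) {
            features &= ~bit_mask;
        }
    }
    return features;
}

/* List that validates STATUS against the backend (case 1 in the thread). */
const int sketch_bits_with_status[] = {
    SKETCH_NET_F_MRG_RXBUF, SKETCH_NET_F_STATUS, SKETCH_INVALID_BIT
};

/* List without STATUS, so the bit passes through unchecked (case 2). */
const int sketch_bits_without_status[] = {
    SKETCH_NET_F_MRG_RXBUF, SKETCH_INVALID_BIT
};
```

With a backend that offers only MRG_RXBUF, the first list strips STATUS from the offered features while the second list leaves it set, which is the difference the thread is debating.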

[PATCH] tests/qtest/libqos/e1000e: Set E1000_CTRL_SLU

2022-11-02 Thread Akihiko Odaki

The later device status check depends on E1000_STATUS_LU, which is
enabled by E1000_CTRL_SLU. Though E1000_CTRL_SLU is not implemented
and E1000_STATUS_LU is always available in the current implementation,
be a bit nicer and set E1000_CTRL_SLU just in case the bit is
implemented in the future.

Signed-off-by: Akihiko Odaki 
---
 tests/qtest/libqos/e1000e.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/libqos/e1000e.c b/tests/qtest/libqos/e1000e.c
index 1f2b8f..11d9f55c66 100644
--- a/tests/qtest/libqos/e1000e.c
+++ b/tests/qtest/libqos/e1000e.c
@@ -122,7 +122,7 @@ static void e1000e_pci_start_hw(QOSGraphObject *obj)
 
 /* Reset the device */
 val = e1000e_macreg_read(&d->e1000e, E1000_CTRL);
-e1000e_macreg_write(&d->e1000e, E1000_CTRL, val | E1000_CTRL_RST);
+e1000e_macreg_write(&d->e1000e, E1000_CTRL, val | E1000_CTRL_RST | E1000_CTRL_SLU);
 
 /* Enable and configure MSI-X */
 qpci_msix_enable(&d->pci_dev);
-- 
2.38.1

Re: [PATCH v3 4/4] hw/nvme: add polling support

2022-11-02 Thread Jinhao Fan

at 7:10 PM, Klaus Jensen  wrote:

> This doesn't do what you expect it to. By not updating the eventidx it
> will fall behind the actual head, causing the host to think that the
> device is not processing events (but it is!), resulting in doorbell
> ringing.

I’m not sure I understand this correctly. 

In 7.13.1 in NVMe Spec 1.4c it says "If updating an entry in the Shadow
Doorbell buffer **changes** the value from being less than or equal to the
value of the corresponding EventIdx buffer entry to being greater than that
value, then the host shall also update the controller's corresponding
doorbell register to match the value of that entry in the Shadow Doorbell
buffer."

So my understanding is that once the eventidx falls behind the actual head,
the host will only ring the doorbell once but *not* for future submissions.

Is this not what real hosts are doing?
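
The rule quoted from the spec can be checked with a small sketch. The helper below uses the wrap-safe comparison that the Linux NVMe driver applies in nvme_dbbuf_need_event(); the sketch_* name and the 16-bit index width are assumptions made for this example, not code from the patch under review.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when moving the shadow doorbell from `old` to `new_idx`
 * steps over `event_idx`, i.e. the value changes from <= event_idx to
 * > event_idx. Arithmetic is modulo 2^16 so index wrap-around is handled.
 */
bool sketch_need_doorbell(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}
```

With event_idx stuck at 5, the update 5 -> 6 asks for a doorbell write, while 6 -> 7 and later updates do not. Under this comparison the host rings once after the eventidx falls behind and then stays quiet until the eventidx is advanced again.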

Re: [PATCH v3 3/4] hw/nvme: add iothread support

2022-11-02 Thread Jinhao Fan

at 7:13 PM, Klaus Jensen  wrote:

> Otherwise, it all looks fine. I'm still seeing the weird slowdown when
> an iothread is enabled. I have yet to figure out why that is... But it
> scales! :)

How much slowdown do you observe on your machine?

[PATCH] tests/qtest/libqos/e1000e: Refer common PCI ID definitions

2022-11-02 Thread Akihiko Odaki

This is yet another minor cleanup to ease understanding and
future refactoring of the tests.

Signed-off-by: Akihiko Odaki 
---
 tests/qtest/libqos/e1000e.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/qtest/libqos/e1000e.c b/tests/qtest/libqos/e1000e.c
index 2ea5db65d8..1f2b8f 100644
--- a/tests/qtest/libqos/e1000e.c
+++ b/tests/qtest/libqos/e1000e.c
@@ -18,6 +18,7 @@
 
 #include "qemu/osdep.h"
 #include "hw/net/e1000_regs.h"
+#include "hw/pci/pci_ids.h"
 #include "../libqtest.h"
 #include "pci-pc.h"
 #include "qemu/sockets.h"
@@ -217,8 +218,8 @@ static void *e1000e_pci_create(void *pci_bus, 
QGuestAllocator *alloc,
 static void e1000e_register_nodes(void)
 {
 QPCIAddress addr = {
-.vendor_id = 0x8086,
-.device_id = 0x10D3,
+.vendor_id = PCI_VENDOR_ID_INTEL,
+.device_id = E1000_DEV_ID_82574L,
 };
 
 /* FIXME: every test using this node needs to setup a -netdev socket,id=hs0
-- 
2.38.1

Re: [PATCH v6 0/3] ppc/e500: Add support for eSDHC

2022-11-02 Thread Daniel Henrique Barboza





On 11/1/22 19:29, Philippe Mathieu-Daudé wrote:

This is a respin of Bernhard's v4 with Freescale eSDHC implemented
as an 'UNIMP' region. See v4 cover here:
https://lore.kernel.org/qemu-devel/20221018210146.193159-1-shen...@gmail.com/

Since v5:
- Rebased (ppc-next merged)
- Properly handle big-endian

Since v4:
- Do not rename ESDHC_* definitions to USDHC_*
- Do not modify SDHCIState structure

Supersedes: <20221031115402.91912-1-phi...@linaro.org>


Queued in gitlab.com/danielhb/qemu/tree/ppc-8.0 (since we missed the
freeze for 7.2).


BTW, checkpatch complained about this line being too long (83 chars):


3/3 Checking commit bc7b8cc88560 (hw/ppc/e500: Add Freescale eSDHC to e500plat)
WARNING: line over 80 characters
#150: FILE: hw/ppc/e500.c:1024:
+pmc->ccsrbar_base + MPC85XX_ESDHC_REGS_OFFSET,


The code excerpt is this:

if (pmc->has_esdhc) {
    create_unimplemented_device("esdhc",
                                pmc->ccsrbar_base + MPC85XX_ESDHC_REGS_OFFSET,
                                MPC85XX_ESDHC_REGS_SIZE);


To get rid of the warning we would need to use a Python-esque indentation (line
break after "(") or create a new variable to hold the sum. Both seem like overkill,
so I'll ignore the warning. Phil is welcome to re-send if he thinks it's worth
it.


And I'll follow it up with my usual plea in these cases: can we move the line size
warning to 100 chars? For QEMU 8.0? Pretty please?


Daniel




Philippe Mathieu-Daudé (3):
   hw/sd/sdhci: MMIO region is implemented in 32-bit accesses
   hw/sd/sdhci: Support big endian SD host controller interfaces
   hw/ppc/e500: Add Freescale eSDHC to e500plat

  docs/system/ppc/ppce500.rst | 13 ++
  hw/ppc/Kconfig  |  2 ++
  hw/ppc/e500.c   | 48 -
  hw/ppc/e500.h   |  1 +
  hw/ppc/e500plat.c   |  1 +
  hw/sd/sdhci-internal.h  |  1 +
  hw/sd/sdhci.c   | 36 +---
  include/hw/sd/sdhci.h   |  1 +
  8 files changed, 99 insertions(+), 4 deletions(-)

Re: [PATCH] virtio-blk: simplify virtio_blk_dma_restart_cb()

2022-11-02 Thread Michael S. Tsirkin

On Wed, Nov 02, 2022 at 02:23:37PM -0400, Stefan Hajnoczi wrote:
> virtio_blk_dma_restart_cb() is tricky because the BH must deal with
> virtio_blk_data_plane_start()/virtio_blk_data_plane_stop() being called.
> 
> There are two issues with the code:
> 
> 1. virtio_blk_realize() should use qdev_add_vm_change_state_handler()
>instead of qemu_add_vm_change_state_handler(). This ensures the
>ordering with virtio_init()'s vm change state handler that calls
>virtio_blk_data_plane_start()/virtio_blk_data_plane_stop() is
>well-defined. Then blk's AioContext is guaranteed to be up-to-date in
>virtio_blk_dma_restart_cb() and it's no longer necessary to have a
>special case for virtio_blk_data_plane_start().
> 
> 2. Only blk_drain() waits for virtio_blk_dma_restart_cb()'s
>blk_inc_in_flight() to be decremented. The bdrv_drain() family of
>functions do not wait for BlockBackend's in_flight counter to reach
>zero. virtio_blk_data_plane_stop() relies on blk_set_aio_context()'s
>implicit drain, but that's a bdrv_drain() and not a blk_drain().
>Note that virtio_blk_reset() already correctly relies on blk_drain().
>If virtio_blk_data_plane_stop() switches to blk_drain() then we can
>properly wait for pending virtio_blk_dma_restart_bh() calls.
> 
> Once these issues are taken care of the code becomes simpler. This
> change is in preparation for multiple IOThreads in virtio-blk where we
> need to clean up the multi-threading behavior.
> 
> I ran the reproducer from commit 49b44549ace7 ("virtio-blk: On restart,
> process queued requests in the proper context") to check that there is
> no regression.
> 
> Cc: Sergio Lopez 
> Cc: Kevin Wolf 
> Cc: Emanuele Giuseppe Esposito 
> Signed-off-by: Stefan Hajnoczi 


Acked-by: Michael S. Tsirkin 

> ---
>  include/hw/virtio/virtio-blk.h  |  2 --
>  hw/block/dataplane/virtio-blk.c | 17 +---
>  hw/block/virtio-blk.c   | 46 ++---
>  3 files changed, 26 insertions(+), 39 deletions(-)
> 
> diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
> index 7f589b4146..dafec432ce 100644
> --- a/include/hw/virtio/virtio-blk.h
> +++ b/include/hw/virtio/virtio-blk.h
> @@ -55,7 +55,6 @@ struct VirtIOBlock {
>  VirtIODevice parent_obj;
>  BlockBackend *blk;
>  void *rq;
> -QEMUBH *bh;
>  VirtIOBlkConf conf;
>  unsigned short sector_mask;
>  bool original_wce;
> @@ -93,6 +92,5 @@ typedef struct MultiReqBuffer {
>  } MultiReqBuffer;
>  
>  void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq);
> -void virtio_blk_process_queued_requests(VirtIOBlock *s, bool is_bh);
>  
>  #endif
> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
> index 26f965cabc..b28d81737e 100644
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -237,9 +237,6 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev)
>  goto fail_aio_context;
>  }
>  
> -/* Process queued requests before the ones in vring */
> -virtio_blk_process_queued_requests(vblk, false);
> -
>  /* Kick right away to begin processing requests already in vring */
>  for (i = 0; i < nvqs; i++) {
>  VirtQueue *vq = virtio_get_queue(s->vdev, i);
> @@ -272,11 +269,6 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev)
>fail_host_notifiers:
>  k->set_guest_notifiers(qbus->parent, nvqs, false);
>fail_guest_notifiers:
> -/*
> - * If we failed to set up the guest notifiers queued requests will be
> - * processed on the main context.
> - */
> -virtio_blk_process_queued_requests(vblk, false);
>  vblk->dataplane_disabled = true;
>  s->starting = false;
>  vblk->dataplane_started = true;
> @@ -325,8 +317,13 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev)
>  aio_context_acquire(s->ctx);
>  aio_wait_bh_oneshot(s->ctx, virtio_blk_data_plane_stop_bh, s);
>  
> -/* Drain and try to switch bs back to the QEMU main loop. If other users
> - * keep the BlockBackend in the iothread, that's ok */
> +/* Wait for virtio_blk_dma_restart_bh() and in flight I/O to complete */
> +blk_drain(s->conf->conf.blk);
> +
> +/*
> + * Try to switch bs back to the QEMU main loop. If other users keep the
> + * BlockBackend in the iothread, that's ok
> + */
>  blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context(), NULL);
>  
>  aio_context_release(s->ctx);
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index f717550fdc..1762517878 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -806,8 +806,10 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, 
> VirtQueue *vq)
>  virtio_blk_handle_vq(s, vq);
>  }
>  
> -void virtio_blk_process_queued_requests(VirtIOBlock *s, bool is_bh)
> +static void virtio_blk_dma_restart_bh(void *opaque)
>  {
> +VirtIOBlock *s = opaque;
> +
>  VirtIOBlockReq *req = s->rq;
>

Re: [PATCH] Run docker probe only if docker or podman are available

2022-11-02 Thread Alex Bennée



Stefan Weil  writes:

> The docker probe uses "sudo -n" which can cause an e-mail with a security warning
> each time when configure is run. Therefore run docker probe only if either docker
> or podman are available.
>
> That avoids the problematic "sudo -n" on build environments which have neither
> docker nor podman installed.

Queued to for-7.2/misc-fixes, thanks.

-- 
Alex Bennée

Re: [PATCH v9 1/8] mm: Introduce memfd_restricted system call to create restricted user memory

2022-11-02 Thread Michael Roth

On Thu, Nov 03, 2022 at 12:14:04AM +0300, Kirill A. Shutemov wrote:
> On Mon, Oct 31, 2022 at 12:47:38PM -0500, Michael Roth wrote:
> > 
> > In v8 there was some discussion about potentially passing the page/folio
> > and order as part of the invalidation callback, I ended up needing
> > something similar for SEV-SNP, and think it might make sense for other
> > platforms. This main reasoning is:
> > 
> >   1) restoring kernel directmap:
> > 
> >  Currently SNP (and I believe TDX) need to either split or remove kernel
> >  direct mappings for restricted PFNs, since there is no guarantee that
> >  other PFNs within a 2MB range won't be used for non-restricted
> >  (which will cause an RMP #PF in the case of SNP since the 2MB
> >  mapping overlaps with guest-owned pages)
> 
> That's news to me. Where the restriction for SNP comes from?

Sorry, missed your first question.

For SNP at least, the restriction is documented in APM Volume 2, Section
15.36.10, First row of Table 15-36 (preceding paragraph has more
context). I forgot to mention this is only pertaining to writes by the
host to 2MB pages that contain guest-owned subpages, for reads it's
not an issue, but I think the implementation requirements end up being
the same either way:

  https://www.amd.com/system/files/TechDocs/24593.pdf

-Mike

> That's news to me. Where the restriction for SNP comes from? There's no
> such limitation on TDX side AFAIK?
> 
> Could you point me to relevant documentation if there's any?
> 
> -- 
>   Kiryl Shutsemau / Kirill A. Shutemov

Re: [PATCH v9 1/8] mm: Introduce memfd_restricted system call to create restricted user memory

2022-11-02 Thread Michael Roth

On Thu, Nov 03, 2022 at 12:14:04AM +0300, Kirill A. Shutemov wrote:
> On Mon, Oct 31, 2022 at 12:47:38PM -0500, Michael Roth wrote:
> > 
> > In v8 there was some discussion about potentially passing the page/folio
> > and order as part of the invalidation callback, I ended up needing
> > something similar for SEV-SNP, and think it might make sense for other
> > platforms. This main reasoning is:
> > 
> >   1) restoring kernel directmap:
> > 
> >  Currently SNP (and I believe TDX) need to either split or remove kernel
> >  direct mappings for restricted PFNs, since there is no guarantee that
> >  other PFNs within a 2MB range won't be used for non-restricted
> >  (which will cause an RMP #PF in the case of SNP since the 2MB
> >  mapping overlaps with guest-owned pages)
> 
> That's news to me. Where the restriction for SNP comes from? There's no
> such limitation on TDX side AFAIK?
> 
> Could you point me to relevant documentation if there's any?

I could be mistaken, I haven't looked into the specific documentation and was
going off of this discussion from a ways back:

  https://lore.kernel.org/all/ywb8wg6ravbs1...@google.com/

Sean, is my read of that correct? Do you happen to know where there's
some documentation on that for the TDX side?

Thanks,

Mike

> 
> -- 
>   Kiryl Shutsemau / Kirill A. Shutemov

Re: [PATCH v9 1/8] mm: Introduce memfd_restricted system call to create restricted user memory

2022-11-02 Thread Michael Roth

On Wed, Nov 02, 2022 at 10:53:25PM +0800, Chao Peng wrote:
> On Tue, Nov 01, 2022 at 02:30:58PM -0500, Michael Roth wrote:
> > On Tue, Nov 01, 2022 at 10:19:44AM -0500, Michael Roth wrote:
> > > On Tue, Nov 01, 2022 at 07:37:29PM +0800, Chao Peng wrote:
> > > > On Mon, Oct 31, 2022 at 12:47:38PM -0500, Michael Roth wrote:
> > > > > On Tue, Oct 25, 2022 at 11:13:37PM +0800, Chao Peng wrote:
> > > > 
> > > > > 
> > > > >   3) Potentially useful for hugetlbfs support:
> > > > > 
> > > > > >  One issue with hugetlbfs is that we don't support splitting the
> > > > > >  hugepage in such cases, which was a big obstacle prior to UPM. Now
> > > > > >  however, we may have the option of doing "lazy" invalidations where
> > > > > >  fallocate(PUNCH_HOLE, ...) won't free a shmem-allocate page unless
> > > > > >  all the subpages within the 2M range are either hole-punched, or the
> > > > > >  guest is shut down, so in that way we never have to split it. Sean
> > > > > >  was pondering something similar in another thread:
> > > > > > 
> > > > > >  https://lore.kernel.org/linux-mm/YyGLXXkFCmxBfu5U@google.com/
> > > > > 
> > > > >  Issuing invalidations with folio-granularity ties in fairly well
> > > > >  with this sort of approach if we end up going that route.
> > > > 
> > > > There is a semantic difference between the current one and the proposed
> > > > one: The invalidation range is exactly what userspace passed down to the
> > > > kernel (being fallocated) while the proposed one will be a subset of that
> > > > (if userspace-provided addr/size is not aligned to power of two), I'm
> > > > not quite confident this difference has no side effect.
> > > 
> > > In theory userspace should not be allocating/hole-punching restricted
> > > pages for GPA ranges that are already mapped as private in the xarray,
> > > and KVM could potentially fail such requests (though it does currently).
> > > 
> > > But if we somehow enforced that, then we could rely on
> > > KVM_MEMORY_ENCRYPT_REG_REGION to handle all the MMU invalidation stuff,
> > > which would free up the restricted fd invalidation callbacks to be used
> > > purely to handle doing things like RMP/directmap fixups prior to returning
> > > restricted pages back to the host. So that was sort of my thinking why the
> > > new semantics would still cover all the necessary cases.
> > 
> > Sorry, this explanation is if we rely on userspace to fallocate() on 2MB
> > boundaries, and ignore any non-aligned requests in the kernel. But
> > that's not how I actually ended up implementing things, so I'm not sure
> > why I answered that way...
> > 
> > In my implementation we actually do issue invalidations for fallocate()
> > even for non-2M-aligned GPA/offset ranges. For instance (assuming
> > restricted FD offset 0 corresponds to GPA 0), an fallocate() on GPA
> > range 0x1000-0x402000 would result in the following invalidations being
> > issued if everything was backed by a 2MB page:
> > 
> >   invalidate GPA: 0x001000-0x200000, Page: pfn_to_page(I), order:9
> >   invalidate GPA: 0x200000-0x400000, Page: pfn_to_page(J), order:9
> >   invalidate GPA: 0x400000-0x402000, Page: pfn_to_page(K), order:9
> 
> Only after seeing this do I understand what you are actually proposing ;)
> 
> So the memory range (start/end) will still be there and covers exactly
> what it should from the userspace point of view; the page+order (or just
> folio) is really just a _hint_ for the invalidation callbacks. Looks
> ugly though.

Yes that's accurate: callbacks still need to handle partial ranges, so
it's more of a hint/optimization for cases where callbacks can benefit
from knowing the entire backing hugepage is being invalidated/freed.
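
To make the example above concrete, here is a rough sketch of how a range could be cut at 2MB boundaries so that each piece falls inside one backing hugepage. It assumes, as in the example, that restricted FD offset 0 corresponds to GPA 0 and that everything is backed by 2MB pages. The sketch_* names and the struct are hypothetical and not taken from the series.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HPAGE_SIZE  (1ULL << 21)  /* 2MB */
#define SKETCH_HPAGE_ORDER 9             /* 512 x 4KB pages */

/* One invalidation: the exact sub-range plus the order of its backing page. */
struct sketch_inval {
    uint64_t start;
    uint64_t end;      /* exclusive */
    unsigned order;
};

/*
 * Split [start, end) at 2MB boundaries. Each output entry stays inside a
 * single hugepage, so a callback gets the partial range it must handle
 * together with the order hint. Returns the number of entries written.
 */
size_t sketch_split_range(uint64_t start, uint64_t end,
                          struct sketch_inval *out, size_t max)
{
    size_t n = 0;

    while (start < end && n < max) {
        uint64_t next = (start & ~(SKETCH_HPAGE_SIZE - 1)) + SKETCH_HPAGE_SIZE;
        uint64_t chunk_end = next < end ? next : end;

        out[n].start = start;
        out[n].end = chunk_end;
        out[n].order = SKETCH_HPAGE_ORDER;
        n++;
        start = chunk_end;
    }
    return n;
}
```

For the range 0x1000-0x402000 this produces three entries, two of which are partial, which is why the callbacks cannot assume that a whole hugepage is going away.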

> 
> In v9 we use an invalidate_start/invalidate_end pair to solve a race
> contention issue
> (https://lore.kernel.org/kvm/Y1LOe4JvnTbFNs4u@google.com/).
> To work with this, I believe we only need to pass this hint info for
> invalidate_start() since at the invalidate_end() time, the page has
> already been discarded.

Ok, yah, that's the approach I'm looking at for v9: pass the page/order
for invalidate_start, but keep invalidate_end as-is.

> 
> Another

Re: [RFC v4 3/3] virtio-blk: add some trace events for zoned emulation

2022-11-02 Thread Philippe Mathieu-Daudé


Hi,

On 30/10/22 10:32, Sam Li wrote:

Signed-off-by: Sam Li 
---
  hw/block/trace-events |  7 +++
  hw/block/virtio-blk.c | 12 
  2 files changed, 19 insertions(+)

diff --git a/hw/block/trace-events b/hw/block/trace-events
index 2c45a62bd5..f47da6fcd4 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -44,9 +44,16 @@ pflash_write_unknown(const char *name, uint8_t cmd) "%s: unknown command 0x%02x"
  # virtio-blk.c
  virtio_blk_req_complete(void *vdev, void *req, int status) "vdev %p req %p status %d"
  virtio_blk_rw_complete(void *vdev, void *req, int ret) "vdev %p req %p ret %d"
+virtio_blk_zone_report_complete(void *vdev, void *req, unsigned int nr_zones, int ret) "vdev %p req %p nr_zones %d ret %d"


" ... nr_zones %u ..."


+virtio_blk_zone_mgmt_complete(void *vdev, void *req, int ret) "vdev %p req %p ret %d"
+virtio_blk_zone_append_complete(void *vdev, void *req, int64_t sector, int ret) "vdev %p req %p, append sector 0x%" PRIx64 " ret %d"
  virtio_blk_handle_write(void *vdev, void *req, uint64_t sector, size_t nsectors) "vdev %p req %p sector %"PRIu64" nsectors %zu"
  virtio_blk_handle_read(void *vdev, void *req, uint64_t sector, size_t nsectors) "vdev %p req %p sector %"PRIu64" nsectors %zu"
  virtio_blk_submit_multireq(void *vdev, void *mrb, int start, int num_reqs, uint64_t offset, size_t size, bool is_write) "vdev %p mrb %p start %d num_reqs %d offset %"PRIu64" size %zu is_write %d"


" ... is_write %u"


+virtio_blk_handle_zone_report(void *vdev, void *req, int64_t sector, unsigned int nr_zones) "vdev %p req %p sector 0x%" PRIx64 " nr_zones %d"


" ... nr_zones %u"


+virtio_blk_handle_zone_mgmt(void *vdev, void *req, uint8_t op, int64_t sector, int64_t len) "vdev %p req %p op 0x%x sector 0x%" PRIx64 " len 0x%" PRIx64 ""
+virtio_blk_handle_zone_reset_all(void *vdev, void *req, int64_t sector, int64_t len) "vdev %p req %p sector 0x%" PRIx64 " cap 0x%" PRIx64 ""
+virtio_blk_handle_zone_append(void *vdev, void *req, int64_t sector) "vdev %p req %p, append sector 0x%" PRIx64 ""


You can probably drop the trailing "".

Regards,

Phil.

Re: [PATCH v9 1/8] mm: Introduce memfd_restricted system call to create restricted user memory

2022-11-02 Thread Kirill A. Shutemov

On Mon, Oct 31, 2022 at 12:47:38PM -0500, Michael Roth wrote:
> 
> In v8 there was some discussion about potentially passing the page/folio
> and order as part of the invalidation callback, I ended up needing
> something similar for SEV-SNP, and think it might make sense for other
> platforms. This main reasoning is:
> 
>   1) restoring kernel directmap:
> 
>  Currently SNP (and I believe TDX) need to either split or remove kernel
>  direct mappings for restricted PFNs, since there is no guarantee that
>  other PFNs within a 2MB range won't be used for non-restricted
>  (which will cause an RMP #PF in the case of SNP since the 2MB
>  mapping overlaps with guest-owned pages)

That's news to me. Where the restriction for SNP comes from? There's no
such limitation on TDX side AFAIK?

Could you point me to relevant documentation if there's any?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

Re: [RFC v4 3/3] virtio-blk: add some trace events for zoned emulation

2022-11-02 Thread Stefan Hajnoczi

On Sun, Oct 30, 2022 at 05:32:42AM -0400, Sam Li wrote:
> Signed-off-by: Sam Li 
> ---
>  hw/block/trace-events |  7 +++
>  hw/block/virtio-blk.c | 12 
>  2 files changed, 19 insertions(+)

Reviewed-by: Stefan Hajnoczi 



Re: [RFC v4 2/3] virtio-blk: add zoned storage emulation for zoned devices

2022-11-02 Thread Stefan Hajnoczi

On Sun, Oct 30, 2022 at 05:32:41AM -0400, Sam Li wrote:
> This patch extends virtio-blk emulation to handle zoned device commands
> by calling the new block layer APIs to perform zoned device I/O on
> behalf of the guest. It supports Report Zone, four zone operations (open,
> close, finish, reset), and Append Zone.
> 
> The VIRTIO_BLK_F_ZONED feature bit will only be set if the host does
> support zoned block devices. It will not be set for regular block devices
> (conventional zones).
> 
> The guest OS can use blktests and fio to test those commands on zoned devices.
> Furthermore, using zonefs to test zone append write is also supported.
> 
> Signed-off-by: Sam Li 
> ---
>  hw/block/virtio-blk-common.c |   2 +
>  hw/block/virtio-blk.c| 387 +++
>  2 files changed, 389 insertions(+)
> 
> diff --git a/hw/block/virtio-blk-common.c b/hw/block/virtio-blk-common.c
> index ac52d7c176..e2f8e2f6da 100644
> --- a/hw/block/virtio-blk-common.c
> +++ b/hw/block/virtio-blk-common.c
> @@ -29,6 +29,8 @@ static const VirtIOFeature feature_sizes[] = {
>   .end = endof(struct virtio_blk_config, discard_sector_alignment)},
>  {.flags = 1ULL << VIRTIO_BLK_F_WRITE_ZEROES,
>   .end = endof(struct virtio_blk_config, write_zeroes_may_unmap)},
> +{.flags = 1ULL << VIRTIO_BLK_F_ZONED,
> + .end = endof(struct virtio_blk_config, zoned)},
>  {}
>  };
>  
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 8131ec2dbc..4f3625840a 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -26,6 +26,9 @@
>  #include "hw/virtio/virtio-blk.h"
>  #include "dataplane/virtio-blk.h"
>  #include "scsi/constants.h"
> +#if defined(CONFIG_BLKZONED)
> +#include 
> +#endif

virtio-blk.c should use QEMU block layer APIs instead of Linux-specific
APIs. Can this include be removed?

>  #ifdef __linux__
>  # include 
>  #endif
> @@ -592,6 +595,332 @@ err:
>  return err_status;
>  }
>  
> +typedef struct ZoneCmdData {
> +VirtIOBlockReq *req;
> +union {
> +struct {
> +unsigned int nr_zones;
> +BlockZoneDescriptor *zones;
> +} zone_report_data;
> +struct {
> +int64_t offset;
> +} zone_append_data;
> +};
> +} ZoneCmdData;
> +
> +/*
> + * check zoned_request: error checking before issuing requests. If all checks
> + * passed, return true.
> + * append: true if only zone append requests issued.
> + */
> +static bool check_zoned_request(VirtIOBlock *s, int64_t offset, int64_t len,
> + bool append, uint8_t *status) {
> +BlockDriverState *bs = blk_bs(s->blk);
> +int index = offset / bs->bl.zone_size;
> +
> +if (offset < 0 || len < 0 || offset > (bs->total_sectors << BDRV_SECTOR_BITS) - len) {

Missing len > (bs->total_sectors << BDRV_SECTOR_BITS) check before
subtracting len.
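
One way to spell out the ordering asked for here is sketched below. The function name and the plain capacity parameter are hypothetical; in the patch the capacity would be bs->total_sectors << BDRV_SECTOR_BITS, and the failure path would set *status as the existing code does.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when [offset, offset + len) lies within [0, capacity].
 * len is compared against capacity before it is subtracted, so the
 * right-hand side of the final comparison is never negative.
 */
bool sketch_range_in_bounds(int64_t offset, int64_t len, int64_t capacity)
{
    if (offset < 0 || len < 0) {
        return false;
    }
    if (len > capacity) {
        return false;
    }
    return offset <= capacity - len;
}
```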

> +*status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
> +return false;
> +}
> +
> +if (!virtio_has_feature(s->host_features, VIRTIO_BLK_F_ZONED)) {
> +*status = VIRTIO_BLK_S_UNSUPP;
> +return false;
> +}
> +
> +if (append) {
> +if ((offset % bs->bl.write_granularity) != 0) {
> +*status = VIRTIO_BLK_S_ZONE_UNALIGNED_WP;
> +return false;
> +}
> +
> +if (BDRV_ZT_IS_CONV(bs->bl.wps->wp[index])) {
> +*status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
> +return false;
> +}
> +
> +if (len / 512 > bs->bl.max_append_sectors) {
> +if (bs->bl.max_append_sectors == 0) {
> +*status = VIRTIO_BLK_S_UNSUPP;
> +} else {
> +*status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
> +}
> +return false;
> +}
> +}
> +return true;
> +}
> +
> +static void virtio_blk_zone_report_complete(void *opaque, int ret)
> +{
> +ZoneCmdData *data = opaque;
> +VirtIOBlockReq *req = data->req;
> +VirtIOBlock *s = req->dev;
> +VirtIODevice *vdev = VIRTIO_DEVICE(req->dev);
> +struct iovec *in_iov = req->elem.in_sg;
> +unsigned in_num = req->elem.in_num;
> +int64_t zrp_size, n, j = 0;
> +int64_t nz = data->zone_report_data.nr_zones;
> +int8_t err_status = VIRTIO_BLK_S_OK;
> +
> +if (ret) {
> +err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
> +goto out;
> +}
> +
> +struct virtio_blk_zone_report zrp_hdr = (struct virtio_blk_zone_report) {
> +.nr_zones = cpu_to_le64(nz),
> +};
> +zrp_size = sizeof(struct virtio_blk_zone_report)
> +   + sizeof(struct virtio_blk_zone_descriptor) * nz;
> +n = iov_from_buf(in_iov, in_num, 0, &zrp_hdr, sizeof(zrp_hdr));
> +if (n != sizeof(zrp_hdr)) {
> +virtio_error(vdev, "Driver provided input buffer that is too small!");
> +err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
> +goto out;
> +}
> +
> +for (size_t i = sizeof(zrp_hdr); i < zrp_size;
> +i += sizeof(struct

Re: [PATCH v4 0/2] Refactoring: expand usage of TFR() macro

2022-11-02 Thread Nikita Ivanov

Hi!
Is there any update on this? I haven't received any comments.

On Sun, Oct 23, 2022 at 12:04 PM Nikita Ivanov 
wrote:

> At the moment, TFR() macro has a vague name and is not used
> where it possibly could be. In order to make it more transparent
> and useful, it was decided to refactor it to make it closer to
> the similar one in glibc: TEMP_FAILURE_RETRY(). Now, macro
> evaluates into an expression and is named RETRY_ON_EINTR(). All the
> places where RETRY_ON_EINTR() macro could be applied were covered.
>
> Nikita Ivanov (2):
>   Refactoring: refactor TFR() macro to RETRY_ON_EINTR()
>   error handling: Use RETRY_ON_EINTR() macro where applicable
>
>  block/file-posix.c| 37 -
>  chardev/char-fd.c |  2 +-
>  chardev/char-pipe.c   |  8 +---
>  chardev/char-pty.c|  4 +---
>  hw/9pfs/9p-local.c|  8 ++--
>  include/qemu/osdep.h  |  8 +++-
>  net/l2tpv3.c  | 17 +
>  net/socket.c  | 16 +++-
>  net/tap-bsd.c |  6 +++---
>  net/tap-linux.c   |  2 +-
>  net/tap-solaris.c |  8 
>  net/tap.c | 10 +++---
>  os-posix.c|  2 +-
>  qga/commands-posix.c  |  4 +---
>  semihosting/syscalls.c|  4 +---
>  tests/qtest/libqtest.c| 14 ++
>  tests/vhost-user-bridge.c |  4 +---
>  util/main-loop.c  |  4 +---
>  util/osdep.c  |  4 +---
>  util/vfio-helpers.c   | 12 ++--
>  20 files changed, 73 insertions(+), 101 deletions(-)
>
> --
> 2.37.3
>
>
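
For readers unfamiliar with the pattern, a macro in the style of glibc's TEMP_FAILURE_RETRY() can be sketched as below. This is an illustration under assumptions, not necessarily the definition from the series: it relies on GNU C statement expressions, and `flaky_read` is a made-up stand-in for a system call that is interrupted twice.

```c
#include <errno.h>

/*
 * Sketch of a retry macro that evaluates to the final result of expr.
 * Needs GNU C extensions (statement expressions, __typeof__).
 */
#define RETRY_ON_EINTR(expr)                              \
    (__extension__ ({                                     \
        __typeof__(expr) result_;                         \
        do {                                              \
            result_ = (expr);                             \
        } while (result_ == -1 && errno == EINTR);        \
        result_;                                          \
    }))

/* Hypothetical stand-in for a system call: fails with EINTR twice. */
static int fake_calls;

static int flaky_read(void)
{
    if (++fake_calls < 3) {
        errno = EINTR;
        return -1;
    }
    return 42;
}
```

Because the macro is an expression, a call site can be written as `ret = RETRY_ON_EINTR(flaky_read());` instead of an open-coded do/while loop.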

-- 
Best Regards,
*Nikita Ivanov* | C developer
*Telephone:* +79140870696
*Telephone:* +79015053149

Re: [PATCH] Run docker probe only if docker or podman are available

2022-11-02 Thread Stefan Weil via


On 30.10.22 at 09:35, Stefan Weil wrote:


The docker probe uses "sudo -n" which can cause an e-mail with a security
warning each time configure is run. Therefore run the docker probe only if
either docker or podman is available.

That avoids the problematic "sudo -n" on build environments which have neither
docker nor podman installed.

Fixes: c4575b59155e2e00 ("configure: store container engine in config-host.mak")
Signed-off-by: Stefan Weil 
---
  configure | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure b/configure
index 81561be7c1..3af99282b7 100755
--- a/configure
+++ b/configure
@@ -1779,7 +1779,7 @@ fi
  # functions to probe cross compilers
  
  container="no"

-if test $use_containers = "yes"; then
+if test $use_containers = "yes" && (has "docker" || has "podman"); then
  case $($python "$source_path"/tests/docker/docker.py probe) in
  *docker) container=docker ;;
  podman) container=podman ;;



Can this patch be applied by qemu-trivial? For me those security e-mails 
are a bug and should be at least avoided as far as possible in the new 
QEMU release.


Thanks, Stefan

Re: [PATCH] tests/unit/test-io-channel-command: Silence GCC error "maybe-uninitialized"

2022-11-02 Thread Alex Bennée



Bernhard Beschow  writes:

> GCC issues a false positive warning, resulting in build failure with -Werror:
>
>   In file included from /usr/lib/glib-2.0/include/glibconfig.h:9,
>from /usr/include/glib-2.0/glib/gtypes.h:34,
>from /usr/include/glib-2.0/glib/galloca.h:34,
>from /usr/include/glib-2.0/glib.h:32,
>from ../src/include/glib-compat.h:32,
>from ../src/include/qemu/osdep.h:144,
>from ../src/tests/unit/test-io-channel-command.c:21:
>   /usr/include/glib-2.0/glib/gmacros.h: In function 
> ‘test_io_channel_command_fifo’:
>   /usr/include/glib-2.0/glib/gmacros.h:1333:105: error: ‘dstargv’ may be used 
> uninitialized [-Werror=maybe-uninitialized]
>1333 |   static G_GNUC_UNUSED inline void _GLIB_AUTO_FUNC_NAME(TypeName) 
> (TypeName *_ptr) { if (*_ptr != none) (func) (*_ptr); } \
> | 
> ^
>   ../src/tests/unit/test-io-channel-command.c:39:19: note: ‘dstargv’ was 
> declared here
>  39 | g_auto(GStrv) dstargv;
> |   ^~~
>   /usr/include/glib-2.0/glib/gmacros.h:1333:105: error: ‘srcargv’ may
> be used uninitialized [-Werror=maybe-uninitialized]
>1333 | static G_GNUC_UNUSED inline void
> _GLIB_AUTO_FUNC_NAME(TypeName) (TypeName *_ptr) { if (*_ptr != none)
> (func) (*_ptr); } \
> | 
> ^
>   ../src/tests/unit/test-io-channel-command.c:38:19: note: ‘srcargv’ was 
> declared here
>  38 | g_auto(GStrv) srcargv;
> |   ^~~
>   cc1: all warnings being treated as errors
>
> GCC version:
>
>   $ gcc --version
>   gcc (GCC) 12.2.0
>
> Fixes: 68406d10859385c88da73d0106254a7f47e6652e ('tests/unit: cleanups for 
> test-io-channel-command')
> Signed-off-by: Bernhard Beschow 
> ---
>  tests/unit/test-io-channel-command.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tests/unit/test-io-channel-command.c 
> b/tests/unit/test-io-channel-command.c
> index 43e29c8cfb..ba0717d3c3 100644
> --- a/tests/unit/test-io-channel-command.c
> +++ b/tests/unit/test-io-channel-command.c
> @@ -35,8 +35,8 @@ static void test_io_channel_command_fifo(bool async)
>  g_autofree gchar *fifo = g_strdup_printf("%s/%s", tmpdir, TEST_FIFO);
>  g_autoptr(GString) srcargs = g_string_new(socat);
>  g_autoptr(GString) dstargs = g_string_new(socat);
> -g_auto(GStrv) srcargv;
> -g_auto(GStrv) dstargv;
> +g_auto(GStrv) srcargv = NULL;
> +g_auto(GStrv) dstargv = NULL;
>  QIOChannel *src, *dst;
>  QIOChannelTest *test;

Another approach would be to drop the GString usage which is premature
and then we can allocate everything in order:

--8<---cut here---start->8---
modified   tests/unit/test-io-channel-command.c
@@ -33,19 +33,13 @@ static void test_io_channel_command_fifo(bool async)
 {
 g_autofree gchar *tmpdir = g_dir_make_tmp("qemu-test-io-channel.XXXXXX", NULL);
 g_autofree gchar *fifo = g_strdup_printf("%s/%s", tmpdir, TEST_FIFO);
-g_autoptr(GString) srcargs = g_string_new(socat);
-g_autoptr(GString) dstargs = g_string_new(socat);
-g_auto(GStrv) srcargv;
-g_auto(GStrv) dstargv;
+g_autofree gchar *srcargs = g_strdup_printf("%s - PIPE:%s,wronly", socat, 
fifo);
+g_autofree gchar *dstargs = g_strdup_printf("%s PIPE:%s,rdonly -", socat, 
fifo);
+g_auto(GStrv) srcargv = g_strsplit(srcargs, " ", -1);
+g_auto(GStrv) dstargv = g_strsplit(dstargs, " ", -1);
 QIOChannel *src, *dst;
 QIOChannelTest *test;
 
-g_string_append_printf(srcargs, " - PIPE:%s,wronly", fifo);
-g_string_append_printf(dstargs, " PIPE:%s,rdonly -", fifo);
-
-srcargv = g_strsplit(srcargs->str, " ", -1);
-dstargv = g_strsplit(dstargs->str, " ", -1);
-
 src = QIO_CHANNEL(qio_channel_command_new_spawn((const char **) srcargv,
 O_WRONLY,
--8<---cut here---end--->8---
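
As background on why initialization at the point of declaration matters: GLib's g_auto family is built on the compiler's cleanup attribute, which runs a handler on the variable when it goes out of scope, whatever its value is at that moment. The plain-C sketch below shows the same shape without GLib. `AUTO_FREE`, `free_charp` and `joined_length` are hypothetical names, and the attribute is a GCC/Clang extension.

```c
#include <stdlib.h>
#include <string.h>

static int frees_seen; /* counts non-NULL pointers released, for illustration */

/* Cleanup handler: runs when the annotated variable goes out of scope. */
static void free_charp(char **p)
{
    if (*p != NULL) {
        frees_seen++;
        free(*p);
    }
}

/* Hypothetical macro mimicking the shape of g_autofree. */
#define AUTO_FREE __attribute__((cleanup(free_charp)))

static size_t joined_length(const char *a, const char *b)
{
    /* Initialized at declaration, so the handler never sees garbage. */
    AUTO_FREE char *buf = malloc(strlen(a) + strlen(b) + 2);

    if (buf == NULL) {
        return 0;
    }
    strcpy(buf, a);
    strcat(buf, " ");
    strcat(buf, b);
    return strlen(buf);
}
```

Declaring and initializing in one statement, as in the diff above, leaves no path on which the handler can observe an indeterminate pointer.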




-- 
Alex Bennée

[PATCH for 7.2] Fix broken configure with -Wunused-parameter

2022-11-02 Thread Stefan Weil via

The configure script fails because it tries to compile small C programs
with a main function which is declared with arguments argc and argv
although those arguments are unused.

Running `configure --extra-cflags=-Wunused-parameter` triggers the problem.
configure for a native build does not abort but shows the error in config.log.
A cross build configure for Windows with Debian stable aborts with an
error.

Avoiding unused arguments fixes this.

Signed-off-by: Stefan Weil 
---

See https://gitlab.com/qemu-project/qemu/-/issues/1295.

I noticed the problem because I often compile with -Wextra.

Stefan

 configure | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index 4275f5419f..1106c04fea 100755
--- a/configure
+++ b/configure
@@ -1258,6 +1258,7 @@ if test "$stack_protector" != "no"; then
   cat > $TMPC << EOF
 int main(int argc, char *argv[])
 {
+(void)argc;
 char arr[64], *p = arr, *c = argv[0];
 while (*c) {
 *p++ = *c++;
@@ -1607,7 +1608,7 @@ fi
 
 if test "$safe_stack" = "yes"; then
 cat > $TMPC << EOF
-int main(int argc, char *argv[])
+int main(void)
 {
 #if ! __has_feature(safe_stack)
 #error SafeStack Disabled
@@ -1629,7 +1630,7 @@ EOF
   fi
 else
 cat > $TMPC << EOF
-int main(int argc, char *argv[])
+int main(void)
 {
 #if defined(__has_feature)
 #if __has_feature(safe_stack)
@@ -1675,7 +1676,7 @@ static const int Z = 1;
 #define TAUT(X) ((X) == Z)
 #define PAREN(X, Y) (X == Y)
 #define ID(X) (X)
-int main(int argc, char *argv[])
+int main(void)
 {
 int x = 0, y = 0;
 x = ID(x);
-- 
2.30.2
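
The two techniques used in the patch can be shown side by side in a small standalone sketch. The function names are hypothetical and ordinary functions are used instead of main() so that both variants fit in one file.

```c
#include <stddef.h>

/* Variant A: the signature requires the parameter but the body does not. */
static size_t arg0_length(int argc, char *argv[])
{
    size_t n = 0;

    (void)argc; /* explicitly discarded; avoids -Wunused-parameter */
    while (argv[0][n] != '\0') {
        n++;
    }
    return n;
}

/* Variant B: drop the parameters entirely when none of them is used. */
static int no_args_needed(void)
{
    return 0;
}
```

Variant B is the simpler choice whenever the test program never looks at its arguments. Variant A is only needed where one of them, here `argv`, is still used.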

Re: [PULL 59/62] hw/block/pflash_cfi0{1, 2}: Error out if device length isn't a power of two

2022-11-02 Thread Daniel Henrique Barboza





On 11/1/22 19:49, Philippe Mathieu-Daudé wrote:

On 1/11/22 23:23, Stefan Hajnoczi wrote:

There is a report that this commit breaks an existing OVMF setup:
https://gitlab.com/qemu-project/qemu/-/issues/1290#note_1156507334

I'm not familiar with pflash. Please find a way to avoid a regression
in QEMU 7.2 here.


Long-standing problem with pflash and underlying images... i.e:
https://lore.kernel.org/qemu-devel/20190308062455.29755-1-arm...@redhat.com/

Let's revert for 7.2. Daniel, I can prepare a patch explaining.


I'd appreciate it if you could send a revert with a proper explanation. I can
make a PR with it.


Daniel

Re: [PULL v2 00/82] pci,pc,virtio: features, tests, fixes, cleanups

2022-11-02 Thread Stefan Hajnoczi

On Wed, Nov 02, 2022 at 12:02:14PM -0400, Michael S. Tsirkin wrote:
> Changes from v1:
> 
> Applied and squashed fixes by Igor, Lei He, Hesham Almatary for
> bugs that tripped up the pipeline.
> Updated expected files for core-count test.

Several "make check" CI failures have occurred. They look like they are
related. Here is one (see the URLs at the bottom of this email for more
details):

17/106 ERROR:../tests/qtest/qos-test.c:191:subprocess_run_one_test: child 
process 
(/arm/virt/virtio-mmio/virtio-bus/virtio-net-device/virtio-net/virtio-net-tests/vhost-user/flags-mismatch/subprocess
 [8609]) failed unexpectedly ERROR
 17/106 qemu:qtest+qtest-arm / qtest-arm/qos-test ERROR 
 31.44s   killed by signal 6 SIGABRT
>>> G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh 
>>> MALLOC_PERTURB_=49 QTEST_QEMU_IMG=./qemu-img 
>>> QTEST_QEMU_BINARY=./qemu-system-arm 
>>> QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon 
>>> /builds/qemu-project/qemu/build/tests/qtest/qos-test --tap -k
― ✀  ―
stderr:
qemu-system-arm: Failed to write msg. Wrote -1 instead of 20.
qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22)
qemu-system-arm: Failed to set msg fds.
qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22)
qemu-system-arm: -chardev 
socket,id=chr-reconnect,path=/tmp/vhost-test-6PT2U1/reconnect.sock,server=on: 
info: QEMU waiting for connection on: 
disconnected:unix:/tmp/vhost-test-6PT2U1/reconnect.sock,server=on
qemu-system-arm: Failed to write msg. Wrote -1 instead of 20.
qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22)
qemu-system-arm: Failed to set msg fds.
qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22)
qemu-system-arm: -chardev 
socket,id=chr-connect-fail,path=/tmp/vhost-test-H8G7U1/connect-fail.sock,server=on:
 info: QEMU waiting for connection on: 
disconnected:unix:/tmp/vhost-test-H8G7U1/connect-fail.sock,server=on
qemu-system-arm: -netdev 
vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: Failed to read msg 
header. Read 0 instead of 12. Original request 1.
qemu-system-arm: -netdev 
vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: vhost_backend_init 
failed: Protocol error
qemu-system-arm: -netdev 
vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: failed to init 
vhost_net for queue 0
qemu-system-arm: -netdev 
vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: info: QEMU waiting 
for connection on: 
disconnected:unix:/tmp/vhost-test-H8G7U1/connect-fail.sock,server=on
qemu-system-arm: Failed to write msg. Wrote -1 instead of 20.
qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22)
qemu-system-arm: Failed to set msg fds.
qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22)
qemu-system-arm: -chardev 
socket,id=chr-flags-mismatch,path=/tmp/vhost-test-94UYU1/flags-mismatch.sock,server=on:
 info: QEMU waiting for connection on: 
disconnected:unix:/tmp/vhost-test-94UYU1/flags-mismatch.sock,server=on
qemu-system-arm: Failed to write msg. Wrote -1 instead of 52.
qemu-system-arm: vhost_set_mem_table failed: Invalid argument (22)
qemu-system-arm: Failed to set msg fds.
qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22)
UndefinedBehaviorSanitizer:DEADLYSIGNAL
==8618==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 
0x (pc 0x55e34deccab0 bp 0x sp 0x7ffc94894710 T8618)
==8618==The signal is caused by a READ memory access.
==8618==Hint: address points to the zero page.
#0 0x55e34deccab0 in ldl_he_p 
/builds/qemu-project/qemu/include/qemu/bswap.h:301:5
#1 0x55e34deccab0 in ldn_he_p 
/builds/qemu-project/qemu/include/qemu/bswap.h:440:1
#2 0x55e34deccab0 in flatview_write_continue 
/builds/qemu-project/qemu/build/../softmmu/physmem.c:2824:19
#3 0x55e34dec9f21 in flatview_write 
/builds/qemu-project/qemu/build/../softmmu/physmem.c:2867:12
#4 0x55e34dec9f21 in address_space_write 
/builds/qemu-project/qemu/build/../softmmu/physmem.c:2963:18
#5 0x55e34decace7 in address_space_unmap 
/builds/qemu-project/qemu/build/../softmmu/physmem.c:3306:9
#6 0x55e34de6d4ec in vhost_memory_unmap 
/builds/qemu-project/qemu/build/../hw/virtio/vhost.c:342:9
#7 0x55e34de6d4ec in vhost_virtqueue_stop 
/builds/qemu-project/qemu/build/../hw/virtio/vhost.c:1242:5
#8 0x55e34de72904 in vhost_dev_stop 
/builds/qemu-project/qemu/build/../hw/virtio/vhost.c:1882:9
#9 0x55e34d890514 in vhost_net_stop_one 
/builds/qemu-project/qemu/build/../hw/net/vhost_net.c:331:5
#10 0x55e34d88fef6 in vhost_net_start 
/builds/qemu-project/qemu/build/../hw/net/vhost_net.c:404:13
#11 0x55e34de0bec6 in virtio_net_vhost_status 
/builds/qemu-project/qemu/build/../hw/net/virtio-net.c:307:13
#12 0x55e34de0bec6 in virtio_net_set_status

Re: [PATCH 2/2] file-posix: add statx(STATX_DIOALIGN) support

2022-11-02 Thread Stefan Hajnoczi

On Tue, Nov 01, 2022 at 08:32:30PM -0700, Eric Biggers wrote:
> On Tue, Nov 01, 2022 at 03:00:31PM -0400, Stefan Hajnoczi wrote:
> >  /* Let's try to use the logical blocksize for the alignment. */
> > -if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) {
> > -bs->bl.request_alignment = 0;
> > +if (!bs->bl.request_alignment) {
> > +if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) {
> > +bs->bl.request_alignment = 0;
> > +}
> >  }
> >  
> >  #ifdef __linux__
> > -/*
> > - * The XFS ioctl definitions are shipped in extra packages that might
> > - * not always be available. Since we just need the XFS_IOC_DIOINFO 
> > ioctl
> > - * here, we simply use our own definition instead:
> > - */
> > -struct xfs_dioattr {
> > -uint32_t d_mem;
> > -uint32_t d_miniosz;
> > -uint32_t d_maxiosz;
> > -} da;
> > -if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), &da) >= 0) {
> > -bs->bl.request_alignment = da.d_miniosz;
> > -/* The kernel returns wrong information for d_mem */
> > -/* s->buf_align = da.d_mem; */
> > +if (!bs->bl.request_alignment) {
> 
> This patch changes the fallback code to make the request_alignment value from
> probe_logical_blocksize() override the value from XFS_IOC_DIOINFO.  Is that
> intentional?

Thanks for pointing out the bug. That was not intentional. Will fix.
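
To make the ordering issue concrete, here is a minimal sketch of a "first successful probe wins" chain. It assumes the intended precedence is statx first, then the XFS ioctl, then the logical block size. The struct and function names are hypothetical and the probe results are passed in as plain numbers rather than obtained from the kernel.

```c
#include <stdint.h>

/* Hypothetical probe results; 0 means the probe failed or is unavailable. */
struct probes {
    uint32_t statx_align; /* e.g. from statx(STATX_DIOALIGN) */
    uint32_t xfs_miniosz; /* e.g. from XFS_IOC_DIOINFO */
    uint32_t logical_bs;  /* e.g. from probe_logical_blocksize() */
};

/* Each later probe is consulted only if every earlier one left 0. */
static uint32_t pick_request_alignment(const struct probes *p)
{
    uint32_t align = p->statx_align;

    if (!align) {
        align = p->xfs_miniosz;
    }
    if (!align) {
        align = p->logical_bs;
    }
    return align;
}
```

With this shape, swapping two of the `if (!align)` blocks silently changes which source takes priority, which is the kind of reordering the review comment points out.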

> > +if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), &da) >= 0) {
> > +bs->bl.request_alignment = da.d_miniosz;
> > +/* The kernel returns wrong information for d_mem */
> > +/* s->buf_align = da.d_mem; */
> 
> Has this bug been reported to the XFS developers (linux-...@vger.kernel.org)?

Paolo: Do you remember if you reported this when you wrote commit
c25f53b06eba ("raw: Probe required direct I/O alignment")?

Stefan



Re: [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned

2022-11-02 Thread Stefan Hajnoczi

On Tue, Nov 01, 2022 at 07:49:20PM -0700, Eric Biggers wrote:
> On Tue, Nov 01, 2022 at 07:27:16PM -0700, Eric Biggers wrote:
> > On Tue, Nov 01, 2022 at 03:00:30PM -0400, Stefan Hajnoczi wrote:
> > > Linux dm-crypt returns errno EIO from unaligned O_DIRECT pread(2) calls.
> > 
> > Citation needed.  For direct I/O to block devices, the kernel's block layer
> > checks the alignment before the I/O is actually submitted to the underlying
> > block device.  See
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/fops.c?h=v6.1-rc3#n306
> > 
> > > Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1290
> > 
> > That "bug" seems to be based on a misunderstanding of the kernel source
> > code, and not any actual testing.
> > 
> > I just tested it, and the error code is EINVAL.
> > 
> 
> I think I see what's happening.  The kernel code was broken just a few months
> ago, in v6.0 by the commit "block: relax direct io memory alignment"
> (https://git.kernel.org/linus/b1a000d3b8ec582d).  Now the block layer lets DIO
> through when the user buffer is only aligned to the device's dma_alignment.
> But a dm-crypt device has a dma_alignment of 512 even when the crypto sector
> size (and thus also the logical block size) is 4096.  So there is now a case
> where misaligned DIO can reach dm-crypt, when that shouldn't be possible.
> 
> It also means that STATX_DIOALIGN will give the wrong value for
> stx_dio_mem_align in the above case, 512 instead of 4096.  This is because
> STATX_DIOALIGN for block devices relies on the dma_alignment.
> 
> I'll raise this on the linux-block and dm-devel mailing lists.  It would be
> nice if people reported kernel bugs instead of silently working around them...

Thanks! You have completed the picture of what's going on here.
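
The mismatch Eric describes can be illustrated with a small alignment predicate. This is only a sketch: `dio_request_aligned` and its parameter names are hypothetical, with `mem_align` playing the role of stx_dio_mem_align and `offset_align` the role of stx_dio_offset_align.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true when a request satisfies the given direct I/O constraints.
 * mem_align applies to the buffer address; offset_align applies to the
 * file offset and the length. Both are assumed to be powers of two.
 */
static bool dio_request_aligned(const void *buf, uint64_t offset, size_t len,
                                uint32_t mem_align, uint32_t offset_align)
{
    if (mem_align == 0 || offset_align == 0) {
        return false;
    }
    return ((uintptr_t)buf & (mem_align - 1)) == 0 &&
           (offset & (offset_align - 1)) == 0 &&
           (len & (offset_align - 1)) == 0;
}
```

A buffer at a 512-byte boundary passes when `mem_align` is 512 but fails when it is 4096, which is how a request can get past a check based on dma_alignment and still be misaligned for a device with 4096-byte crypto sectors.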

Stefan



Re: [PATCH 3/3] target/tricore: Rename csfr.def -> csfr.h.inc

2022-11-02 Thread Laurent Vivier


On 26/10/2022 at 01:50, Philippe Mathieu-Daudé wrote:

We use the .h.inc extension to include C headers. To be consistent
with the rest of the codebase, rename the C headers using the .def
extension.

IDE/tools using our .editorconfig / .gitattributes will leverage
this consistency.

Signed-off-by: Philippe Mathieu-Daudé 
---
  target/tricore/{csfr.def => csfr.h.inc} | 0
  target/tricore/translate.c  | 4 ++--
  2 files changed, 2 insertions(+), 2 deletions(-)
  rename target/tricore/{csfr.def => csfr.h.inc} (100%)

diff --git a/target/tricore/csfr.def b/target/tricore/csfr.h.inc
similarity index 100%
rename from target/tricore/csfr.def
rename to target/tricore/csfr.h.inc
diff --git a/target/tricore/translate.c b/target/tricore/translate.c
index a0558ead71..f02090945d 100644
--- a/target/tricore/translate.c
+++ b/target/tricore/translate.c
@@ -388,7 +388,7 @@ static inline void gen_mfcr(DisasContext *ctx, TCGv ret, 
int32_t offset)
  gen_helper_psw_read(ret, cpu_env);
  } else {
  switch (offset) {
-#include "csfr.def"
+#include "csfr.h.inc"
  }
  }
  }
@@ -418,7 +418,7 @@ static inline void gen_mtcr(DisasContext *ctx, TCGv r1,
  gen_helper_psw_write(cpu_env, r1);
  } else {
  switch (offset) {
-#include "csfr.def"
+#include "csfr.h.inc"
  }
  }
  } else {


Applied to my trivial-patches branch.

Thanks,
Laurent

Re: [PATCH 2/3] target/s390x: Rename insn-data/format.def -> insn-data/format.h.inc

2022-11-02 Thread Laurent Vivier


On 26/10/2022 at 01:50, Philippe Mathieu-Daudé wrote:

We use the .h.inc extension to include C headers. To be consistent
with the rest of the codebase, rename the C headers using the .def
extension.

IDE/tools using our .editorconfig / .gitattributes will leverage
this consistency.

Signed-off-by: Philippe Mathieu-Daudé 
---
  target/s390x/tcg/{insn-data.def => insn-data.h.inc}|  2 +-
  .../s390x/tcg/{insn-format.def => insn-format.h.inc}   |  0
  target/s390x/tcg/translate.c   | 10 +-
  3 files changed, 6 insertions(+), 6 deletions(-)
  rename target/s390x/tcg/{insn-data.def => insn-data.h.inc} (99%)
  rename target/s390x/tcg/{insn-format.def => insn-format.h.inc} (100%)

diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.h.inc
similarity index 99%
rename from target/s390x/tcg/insn-data.def
rename to target/s390x/tcg/insn-data.h.inc
index 6382ceabfc..7e952bdfc8 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.h.inc
@@ -8,7 +8,7 @@
   *
   *  OPC  = (op << 8) | op2 where op is the major, op2 the minor opcode
   *  NAME = name of the opcode, used internally
- *  FMT  = format of the opcode (defined in insn-format.def)
+ *  FMT  = format of the opcode (defined in insn-format.h.inc)
   *  FAC  = facility the opcode is available in (defined in DisasFacility)
   *  I1   = func in1_xx fills o->in1
   *  I2   = func in2_xx fills o->in2
diff --git a/target/s390x/tcg/insn-format.def 
b/target/s390x/tcg/insn-format.h.inc
similarity index 100%
rename from target/s390x/tcg/insn-format.def
rename to target/s390x/tcg/insn-format.h.inc
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 1d2dddab1c..f378e1a633 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -1011,7 +1011,7 @@ static void free_compare(DisasCompare *c)
  #define F6(N, X1, X2, X3, X4, X5, X6) F0(N)
  
  typedef enum {

-#include "insn-format.def"
+#include "insn-format.h.inc"
  } DisasFormat;
  
  #undef F0

@@ -1076,7 +1076,7 @@ typedef struct DisasFormatInfo {
  #define F6(N, X1, X2, X3, X4, X5, X6)   { { X1, X2, X3, X4, X5, X6 } },
  
  static const DisasFormatInfo format_info[] = {

-#include "insn-format.def"
+#include "insn-format.h.inc"
  };
  
  #undef F0

@@ -6143,7 +6143,7 @@ static void in2_insn(DisasContext *s, DisasOps *o)
  #define E(OPC, NM, FT, FC, I1, I2, P, W, OP, CC, D, FL) insn_ ## NM,
  
  enum DisasInsnEnum {

-#include "insn-data.def"
+#include "insn-data.h.inc"
  };
  
  #undef E

@@ -6223,7 +6223,7 @@ enum DisasInsnEnum {
  #define FAC_MIE3 S390_FEAT_MISC_INSTRUCTION_EXT3 /* miscellaneous-instruction-extensions facility 3 */
  
  static const DisasInsn insn_info[] = {

-#include "insn-data.def"
+#include "insn-data.h.inc"
  };
  
  #undef E

@@ -6233,7 +6233,7 @@ static const DisasInsn insn_info[] = {
  static const DisasInsn *lookup_opc(uint16_t opc)
  {
  switch (opc) {
-#include "insn-data.def"
+#include "insn-data.h.inc"
  default:
  return NULL;
  }


Applied to my trivial-patches branch.

Thanks,
Laurent

Re: [PATCH 1/3] target/m68k: Rename qregs.def -> qregs.h.inc

2022-11-02 Thread Laurent Vivier


On 26/10/2022 at 01:50, Philippe Mathieu-Daudé wrote:

We use the .h.inc extension to include C headers. To be consistent
with the rest of the codebase, rename the C headers using the .def
extension.

IDE/tools using our .editorconfig / .gitattributes will leverage
this consistency.

Signed-off-by: Philippe Mathieu-Daudé 
---
  target/m68k/{qregs.def => qregs.h.inc} | 0
  target/m68k/translate.c| 4 ++--
  2 files changed, 2 insertions(+), 2 deletions(-)
  rename target/m68k/{qregs.def => qregs.h.inc} (100%)

diff --git a/target/m68k/qregs.def b/target/m68k/qregs.h.inc
similarity index 100%
rename from target/m68k/qregs.def
rename to target/m68k/qregs.h.inc
diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 9df17aa4b2..f018fa9eb0 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -39,7 +39,7 @@
  
  #define DEFO32(name, offset) static TCGv QREG_##name;

  #define DEFO64(name, offset) static TCGv_i64 QREG_##name;
-#include "qregs.def"
+#include "qregs.h.inc"
  #undef DEFO32
  #undef DEFO64
  
@@ -75,7 +75,7 @@ void m68k_tcg_init(void)

  #define DEFO64(name, offset) \
  QREG_##name = tcg_global_mem_new_i64(cpu_env, \
  offsetof(CPUM68KState, offset), #name);
-#include "qregs.def"
+#include "qregs.h.inc"
  #undef DEFO32
  #undef DEFO64
  


Applied to my trivial-patches branch.

Thanks,
Laurent
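
For context, the files renamed in this series are lists of macro invocations that are included several times under different macro definitions. A self-contained sketch of the pattern follows. The register names are made up, and the list is held in a macro here only because a single example cannot ship a separate include file.

```c
/*
 * In-tree, this list would live in its own .h.inc file and be pulled in
 * with #include. The entries are hypothetical.
 */
#define REG_LIST(X) \
    X(PC)           \
    X(SR)           \
    X(FPSR)

/* First expansion: one enum constant per entry. */
#define AS_ENUM(name) REG_##name,
enum reg_id { REG_LIST(AS_ENUM) REG_COUNT };
#undef AS_ENUM

/* Second expansion: a matching name table, kept in sync automatically. */
#define AS_STRING(name) #name,
static const char *const reg_names[] = { REG_LIST(AS_STRING) };
#undef AS_STRING
```

Adding an entry to the list updates both the enum and the table, which is the reason such files are fragments meant for inclusion rather than standalone headers.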

Re: [PATCH 3/3] libvhost-user: Add format attribute to local function vu_panic

2022-11-02 Thread Laurent Vivier


On 22/04/2022 at 09:01, Stefan Weil wrote:

Signed-off-by: Stefan Weil 
---

It would be good to add format attributes to local functions, too (like
it is done here) to avoid future format bugs.

The changes here could be simplified by including a glib header,
but from the comments I assumed that is unwanted here?

  subprojects/libvhost-user/libvhost-user.c | 13 -
  1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index 94645f9154..29ab85fc9d 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -45,6 +45,17 @@
  #include "libvhost-user.h"
  
  /* usually provided by GLib */

+#if __GNUC__ > 2 || (__GNUC__ == 2 && __GNUC_MINOR__ > 4)
+#if !defined(__clang__) && (__GNUC__ == 4 && __GNUC_MINOR__ == 4)
+#define G_GNUC_PRINTF(format_idx, arg_idx) \
+  __attribute__((__format__(gnu_printf, format_idx, arg_idx)))
+#else
+#define G_GNUC_PRINTF(format_idx, arg_idx) \
+  __attribute__((__format__(__printf__, format_idx, arg_idx)))
+#endif
+#else   /* !__GNUC__ */
+#define G_GNUC_PRINTF(format_idx, arg_idx)
+#endif  /* !__GNUC__ */
  #ifndef MIN
  #define MIN(x, y) ({\
  typeof(x) _min1 = (x);  \
@@ -151,7 +162,7 @@ vu_request_to_string(unsigned int req)
  }
  }
  
-static void

+static void G_GNUC_PRINTF(2, 3)
  vu_panic(VuDev *dev, const char *msg, ...)
  {
  char *buf = NULL;


Applied to my trivial-patches branch.

Thanks,
Laurent
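
A minimal sketch of what the format attribute buys on a local variadic function is shown below. `PRINTF_LIKE` and `format_msg` are hypothetical names and the attribute is a GCC/Clang extension. With it in place the compiler can check each call's arguments against the format string, as it does for printf itself.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, arg_idx) \
    __attribute__((format(printf, fmt_idx, arg_idx)))
#else
#define PRINTF_LIKE(fmt_idx, arg_idx)
#endif

/* Argument 3 is the format string; the varargs start at argument 4. */
static int PRINTF_LIKE(3, 4)
format_msg(char *buf, size_t cap, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, cap, fmt, ap);
    va_end(ap);
    return n;
}
```

A call such as `format_msg(buf, sizeof(buf), "%d bytes", "oops")` would then be diagnosed under -Wformat instead of misbehaving at run time.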

Re: [PATCH 2/3] libvhost-user: Fix format strings

2022-11-02 Thread Laurent Vivier


On 22/04/2022 at 09:01, Stefan Weil wrote:

Signed-off-by: Stefan Weil 
---
  subprojects/libvhost-user/libvhost-user.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index 2d29140a8f..94645f9154 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -700,7 +700,7 @@ vu_add_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
  if (vmsg->size < VHOST_USER_MEM_REG_SIZE) {
  close(vmsg->fds[0]);
  vu_panic(dev, "VHOST_USER_ADD_MEM_REG requires a message size of at "
-  "least %d bytes and only %d bytes were received",
+  "least %zu bytes and only %d bytes were received",
VHOST_USER_MEM_REG_SIZE, vmsg->size);
  return false;
  }
@@ -833,7 +833,7 @@ vu_rem_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
  if (vmsg->size < VHOST_USER_MEM_REG_SIZE) {
  close(vmsg->fds[0]);
  vu_panic(dev, "VHOST_USER_REM_MEM_REG requires a message size of at "
-  "least %d bytes and only %d bytes were received",
+  "least %zu bytes and only %d bytes were received",
VHOST_USER_MEM_REG_SIZE, vmsg->size);
  return false;
  }


Applied to my trivial-patches branch.

Thanks,
Laurent
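
As a standalone illustration of the conversion fixed here: a value of type size_t, such as the result of sizeof, needs %zu. The macro and function below are hypothetical stand-ins, not the library's definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a minimum message size; its type is size_t. */
#define MIN_MSG_SIZE sizeof(uint64_t[5])

static int describe_short_msg(char *buf, size_t cap, uint32_t received)
{
    /* %zu matches size_t; %u matches unsigned int. */
    return snprintf(buf, cap, "need at least %zu bytes, got %u",
                    MIN_MSG_SIZE, (unsigned int)received);
}
```

Passing a size_t to %d happens to work on some ABIs and breaks on others, which is why the mismatch is worth fixing even when the output looks right locally.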

[PATCH] virtio-blk: simplify virtio_blk_dma_restart_cb()

2022-11-02 Thread Stefan Hajnoczi

virtio_blk_dma_restart_cb() is tricky because the BH must deal with
virtio_blk_data_plane_start()/virtio_blk_data_plane_stop() being called.

There are two issues with the code:

1. virtio_blk_realize() should use qdev_add_vm_change_state_handler()
   instead of qemu_add_vm_change_state_handler(). This ensures the
   ordering with virtio_init()'s vm change state handler that calls
   virtio_blk_data_plane_start()/virtio_blk_data_plane_stop() is
   well-defined. Then blk's AioContext is guaranteed to be up-to-date in
   virtio_blk_dma_restart_cb() and it's no longer necessary to have a
   special case for virtio_blk_data_plane_start().

2. Only blk_drain() waits for virtio_blk_dma_restart_cb()'s
   blk_inc_in_flight() to be decremented. The bdrv_drain() family of
   functions do not wait for BlockBackend's in_flight counter to reach
   zero. virtio_blk_data_plane_stop() relies on blk_set_aio_context()'s
   implicit drain, but that's a bdrv_drain() and not a blk_drain().
   Note that virtio_blk_reset() already correctly relies on blk_drain().
   If virtio_blk_data_plane_stop() switches to blk_drain() then we can
   properly wait for pending virtio_blk_dma_restart_bh() calls.

Once these issues are taken care of the code becomes simpler. This
change is in preparation for multiple IOThreads in virtio-blk where we
need to clean up the multi-threading behavior.
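
The accounting described in point 2 can be modelled with a toy, single-threaded sketch. None of the names below are QEMU APIs. The struct and functions are hypothetical and only show why a drain that watches the in-flight counter also waits for a scheduled bottom half.

```c
#include <stdbool.h>

struct toy_backend {
    int in_flight;   /* outstanding operations, including a pending BH */
    bool bh_pending; /* a deferred restart has been scheduled */
    int restarted;   /* number of times the restart handler ran */
};

/* Schedule deferred work and account for it immediately. */
static void toy_schedule_restart(struct toy_backend *b)
{
    b->in_flight++;
    b->bh_pending = true;
}

/* The deferred handler drops the count taken when it was scheduled. */
static void toy_restart_bh(struct toy_backend *b)
{
    b->restarted++;
    b->bh_pending = false;
    b->in_flight--;
}

/* Drain: run pending work until nothing is in flight any more. */
static void toy_drain(struct toy_backend *b)
{
    while (b->in_flight > 0) {
        if (!b->bh_pending) {
            break; /* nothing this toy model can make progress on */
        }
        toy_restart_bh(b);
    }
}
```

A drain that ignored `in_flight` would return while the restart was still pending, which corresponds to the gap the commit message attributes to relying on a bdrv_drain() instead of a blk_drain().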

I ran the reproducer from commit 49b44549ace7 ("virtio-blk: On restart,
process queued requests in the proper context") to check that there is
no regression.

Cc: Sergio Lopez 
Cc: Kevin Wolf 
Cc: Emanuele Giuseppe Esposito 
Signed-off-by: Stefan Hajnoczi 
---
 include/hw/virtio/virtio-blk.h  |  2 --
 hw/block/dataplane/virtio-blk.c | 17 +---
 hw/block/virtio-blk.c   | 46 ++---
 3 files changed, 26 insertions(+), 39 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 7f589b4146..dafec432ce 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -55,7 +55,6 @@ struct VirtIOBlock {
 VirtIODevice parent_obj;
 BlockBackend *blk;
 void *rq;
-QEMUBH *bh;
 VirtIOBlkConf conf;
 unsigned short sector_mask;
 bool original_wce;
@@ -93,6 +92,5 @@ typedef struct MultiReqBuffer {
 } MultiReqBuffer;
 
 void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq);
-void virtio_blk_process_queued_requests(VirtIOBlock *s, bool is_bh);
 
 #endif
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 26f965cabc..b28d81737e 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -237,9 +237,6 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev)
 goto fail_aio_context;
 }
 
-/* Process queued requests before the ones in vring */
-virtio_blk_process_queued_requests(vblk, false);
-
 /* Kick right away to begin processing requests already in vring */
 for (i = 0; i < nvqs; i++) {
 VirtQueue *vq = virtio_get_queue(s->vdev, i);
@@ -272,11 +269,6 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev)
   fail_host_notifiers:
 k->set_guest_notifiers(qbus->parent, nvqs, false);
   fail_guest_notifiers:
-/*
- * If we failed to set up the guest notifiers queued requests will be
- * processed on the main context.
- */
-virtio_blk_process_queued_requests(vblk, false);
 vblk->dataplane_disabled = true;
 s->starting = false;
 vblk->dataplane_started = true;
@@ -325,8 +317,13 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev)
 aio_context_acquire(s->ctx);
 aio_wait_bh_oneshot(s->ctx, virtio_blk_data_plane_stop_bh, s);
 
-/* Drain and try to switch bs back to the QEMU main loop. If other users
- * keep the BlockBackend in the iothread, that's ok */
+/* Wait for virtio_blk_dma_restart_bh() and in flight I/O to complete */
+blk_drain(s->conf->conf.blk);
+
+/*
+ * Try to switch bs back to the QEMU main loop. If other users keep the
+ * BlockBackend in the iothread, that's ok
+ */
 blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context(), NULL);
 
 aio_context_release(s->ctx);
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f717550fdc..1762517878 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -806,8 +806,10 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, 
VirtQueue *vq)
 virtio_blk_handle_vq(s, vq);
 }
 
-void virtio_blk_process_queued_requests(VirtIOBlock *s, bool is_bh)
+static void virtio_blk_dma_restart_bh(void *opaque)
 {
+VirtIOBlock *s = opaque;
+
 VirtIOBlockReq *req = s->rq;
 MultiReqBuffer mrb = {};
 
@@ -834,43 +836,27 @@ void virtio_blk_process_queued_requests(VirtIOBlock *s, 
bool is_bh)
 if (mrb.num_reqs) {
 virtio_blk_submit_multireq(s, &mrb);
 }
-if (is_bh) {
-blk_dec_in_flight(s->conf.conf.blk);
-}
+
+/* Paired with inc in virtio_blk_dma_restart_cb() */
+

Re: [PATCH 1/3] libvhost-user: Fix wrong type of argument to formatting function (reported by LGTM)

2022-11-02 Thread Laurent Vivier


On 22/04/2022 at 09:01, Stefan Weil wrote:

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefan Weil 
---

This patch was already sent to the list and got reviewed, but missed
release 7.0.0.

  subprojects/libvhost-user/libvhost-user.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index 47d2efc60f..2d29140a8f 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -651,7 +651,7 @@ generate_faults(VuDev *dev) {
  
  if (ioctl(dev->postcopy_ufd, UFFDIO_REGISTER, &reg_struct)) {

  vu_panic(dev, "%s: Failed to userfault region %d "
-  "@%p + size:%zx offset: %zx: (ufd=%d)%s\n",
+  "@%" PRIx64 " + size:%zx offset: %zx: (ufd=%d)%s\n",
   __func__, i,
   dev_region->mmap_addr,
   dev_region->size, dev_region->mmap_offset,


Applied to my trivial-patches branch.

Thanks,
Laurent

Re: [PATCH v4 1/2] xen/pt: fix syntax error that causes FTBFS in some configurations

2022-11-02 Thread Laurent Vivier


On 31/10/2022 at 22:35, Chuck Zmudzinski wrote:

When Qemu is built with --enable-xen and --disable-xen-pci-passthrough
and the target os is linux, the build fails with:

meson.build:3477:2: ERROR: File xen_pt_stub.c does not exist.

Fixes: 582ea95f5f93 ("meson: convert hw/xen")

Signed-off-by: Chuck Zmudzinski 
---
v2: Remove From:  tag at top of commit message

v3: No change to this patch since v2

v4: Use brchu...@aol.com instead of brchu...@netscape.net for the author's
 email address to match the address used by the same author in commits
 be9c61da and c0e86b76

  hw/xen/meson.build | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/xen/meson.build b/hw/xen/meson.build
index 08dc1f6857..ae0ace3046 100644
--- a/hw/xen/meson.build
+++ b/hw/xen/meson.build
@@ -18,7 +18,7 @@ if have_xen_pci_passthrough
  'xen_pt_msi.c',
))
  else
-  xen_specific_ss.add('xen_pt_stub.c')
+  xen_specific_ss.add(files('xen_pt_stub.c'))
  endif
  
  specific_ss.add_all(when: ['CONFIG_XEN', xen], if_true: xen_specific_ss)


Applied to my trivial-patches branch.

Thanks,
Laurent

Re: [PATCH] Add nsis.py to W32/W64 section in MAINTAINERS

2022-11-02 Thread Laurent Vivier


On 31/10/2022 at 14:17, Stefan Weil via wrote:

On 31.10.22 at 11:28, Philippe Mathieu-Daudé wrote:


On 31/10/22 10:57, Stefan Weil via wrote:

Signed-off-by: Stefan Weil 
---
  MAINTAINERS | 1 +
  1 file changed, 1 insertion(+)


Reviewed-by: Philippe Mathieu-Daudé 



Cc qemu-trivial





If I'm right, this change has already been merged by:

commit 48fad83ff49bd47368223cf1121351f51cf3565f
Author: Alex Bennée 
Date:   Thu Oct 27 19:36:21 2022 +0100

MAINTAINERS: add entries for the key build bits

Changes to the build files are a bit special in that they usually go
through other maintainer trees. However considering the build system
is the root of everything a developer is likely to do we should at
least set it out in MAINTAINERS.

I'm going to nominate Paolo for meson stuff given the conversion was
his passion project. I'm happy to cast an eye over configure stuff
considering a lot of the cross compile logic is in there anyway.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Alex Bennée 
Acked-by: Thomas Huth 
Cc: Paolo Bonzini 
Message-Id: <20221027183637.2772968-16-alex.ben...@linaro.org>

Thanks,
Laurent

Re: [PATCH v2] Fix some typos in documentation and comments

2022-11-02 Thread Laurent Vivier


On 31/10/2022 at 14:22, Stefan Weil via wrote:

On 31.10.22 at 08:35, Thomas Huth wrote:


On 30/10/2022 11.59, Stefan Weil wrote:

Most of them were found and fixed using codespell.

Signed-off-by: Stefan Weil 
---

v2: Fixes from Peter Maydell's comments

My focus was fixing typos which are relevant for the generated documentation.

codespell finds many more typos in source code, and adding it to the continuous
integration checks looks more and more like a good idea.


... at least for the docs/ folder, this might indeed be a good idea.

Reviewed-by: Thomas Huth 



See also "Reviewed-by: Stefan Hajnoczi " for the first 
version of this patch.

Maybe the pull request can be made by qemu-trivial?

Thanks,

Stefan




Applied to my trivial-patches branch.

Thanks,
Laurent

Re: [PATCH] qapi: virtio: Fix the introduced version

2022-11-02 Thread Laurent Vivier


On 01/11/2022 at 02:46, Han Han wrote:

The items in qapi/virtio.json were introduced in a5ebce38576. They will
first appear in version 7.2, not 7.1.

Signed-off-by: Han Han 
---
  qapi/virtio.json | 34 +-
  1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/qapi/virtio.json b/qapi/virtio.json
index e47a8fb2e0..872c7e3623 100644
--- a/qapi/virtio.json
+++ b/qapi/virtio.json
@@ -15,7 +15,7 @@
  #
  # @name: Name of the VirtIODevice
  #
-# Since: 7.1
+# Since: 7.2
  #
  ##
  { 'struct': 'VirtioInfo',
@@ -32,7 +32,7 @@
  #
  # Returns: List of gathered VirtIODevices
  #
-# Since: 7.1
+# Since: 7.2
  #
  # Example:
  #
@@ -97,7 +97,7 @@
  #
  # @log-size: vhost_dev log_size
  #
-# Since: 7.1
+# Since: 7.2
  #
  ##
  
@@ -167,7 +167,7 @@

  # Present if the given VirtIODevice has an active vhost
  # device.
  #
-# Since: 7.1
+# Since: 7.2
  #
  ##
  
@@ -206,7 +206,7 @@

  #
  # Returns: VirtioStatus of the virtio device
  #
-# Since: 7.1
+# Since: 7.2
  #
  # Examples:
  #
@@ -452,7 +452,7 @@
  #
  # @unknown-statuses: Virtio device statuses bitmap that have not been decoded
  #
-# Since: 7.1
+# Since: 7.2
  ##
  
  { 'struct': 'VirtioDeviceStatus',

@@ -471,7 +471,7 @@
  # @unknown-protocols: Vhost user device protocol features bitmap that
  # have not been decoded
  #
-# Since: 7.1
+# Since: 7.2
  ##
  
  { 'struct': 'VhostDeviceProtocols',

@@ -492,7 +492,7 @@
  # @unknown-dev-features: Virtio device features bitmap that have not
  #been decoded
  #
-# Since: 7.1
+# Since: 7.2
  ##
  
  { 'struct': 'VirtioDeviceFeatures',

@@ -535,7 +535,7 @@
  #
  # @signalled-used-valid: VirtQueue signalled_used_valid flag
  #
-# Since: 7.1
+# Since: 7.2
  #
  ##
  
@@ -576,7 +576,7 @@

  #shadow_avail_idx will not be displayed in the case where
  #the selected VirtIODevice has a running vhost device.
  #
-# Since: 7.1
+# Since: 7.2
  #
  # Examples:
  #
@@ -666,7 +666,7 @@
  #
  # @used-size: vhost_virtqueue used_size
  #
-# Since: 7.1
+# Since: 7.2
  #
  ##
  
@@ -699,7 +699,7 @@

  #
  # Returns: VirtVhostQueueStatus of the vhost_virtqueue
  #
-# Since: 7.1
+# Since: 7.2
  #
  # Examples:
  #
@@ -767,7 +767,7 @@
  #
  # @flags: List of descriptor flags
  #
-# Since: 7.1
+# Since: 7.2
  #
  ##
  
@@ -787,7 +787,7 @@

  #
  # @ring: VRingAvail ring[] entry at provided index
  #
-# Since: 7.1
+# Since: 7.2
  #
  ##
  
@@ -805,7 +805,7 @@

  #
  # @idx: VRingUsed index
  #
-# Since: 7.1
+# Since: 7.2
  #
  ##
  
@@ -829,7 +829,7 @@

  #
  # @used: VRingUsed info
  #
-# Since: 7.1
+# Since: 7.2
  #
  ##
  
@@ -857,7 +857,7 @@

  #
  # Returns: VirtioQueueElement information
  #
-# Since: 7.1
+# Since: 7.2
  #
  # Examples:
  #


Applied to my trivial-patches branch.

Thanks,
Laurent

Re: [PATCH] tests/unit/test-io-channel-command: Silence GCC error "maybe-uninitialized"

2022-11-02 Thread Laurent Vivier


On 01/11/2022 at 22:39, Bernhard Beschow wrote:

GCC issues a false positive warning, resulting in build failure with -Werror:

   In file included from /usr/lib/glib-2.0/include/glibconfig.h:9,
from /usr/include/glib-2.0/glib/gtypes.h:34,
from /usr/include/glib-2.0/glib/galloca.h:34,
from /usr/include/glib-2.0/glib.h:32,
from ../src/include/glib-compat.h:32,
from ../src/include/qemu/osdep.h:144,
from ../src/tests/unit/test-io-channel-command.c:21:
   /usr/include/glib-2.0/glib/gmacros.h: In function 
‘test_io_channel_command_fifo’:
   /usr/include/glib-2.0/glib/gmacros.h:1333:105: error: ‘dstargv’ may be used 
uninitialized [-Werror=maybe-uninitialized]
1333 |   static G_GNUC_UNUSED inline void _GLIB_AUTO_FUNC_NAME(TypeName) 
(TypeName *_ptr) { if (*_ptr != none) (func) (*_ptr); } \
 |  
   ^
   ../src/tests/unit/test-io-channel-command.c:39:19: note: ‘dstargv’ was 
declared here
  39 | g_auto(GStrv) dstargv;
 |   ^~~
   /usr/include/glib-2.0/glib/gmacros.h:1333:105: error: ‘srcargv’ may be used 
uninitialized [-Werror=maybe-uninitialized]
1333 |   static G_GNUC_UNUSED inline void _GLIB_AUTO_FUNC_NAME(TypeName) 
(TypeName *_ptr) { if (*_ptr != none) (func) (*_ptr); } \
 |  
   ^
   ../src/tests/unit/test-io-channel-command.c:38:19: note: ‘srcargv’ was 
declared here
  38 | g_auto(GStrv) srcargv;
 |   ^~~
   cc1: all warnings being treated as errors

GCC version:

   $ gcc --version
   gcc (GCC) 12.2.0

Fixes: 68406d10859385c88da73d0106254a7f47e6652e ('tests/unit: cleanups for 
test-io-channel-command')
Signed-off-by: Bernhard Beschow 
---
  tests/unit/test-io-channel-command.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/unit/test-io-channel-command.c 
b/tests/unit/test-io-channel-command.c
index 43e29c8cfb..ba0717d3c3 100644
--- a/tests/unit/test-io-channel-command.c
+++ b/tests/unit/test-io-channel-command.c
@@ -35,8 +35,8 @@ static void test_io_channel_command_fifo(bool async)
  g_autofree gchar *fifo = g_strdup_printf("%s/%s", tmpdir, TEST_FIFO);
  g_autoptr(GString) srcargs = g_string_new(socat);
  g_autoptr(GString) dstargs = g_string_new(socat);
-g_auto(GStrv) srcargv;
-g_auto(GStrv) dstargv;
+g_auto(GStrv) srcargv = NULL;
+g_auto(GStrv) dstargv = NULL;
  QIOChannel *src, *dst;
  QIOChannelTest *test;
  


Applied to my trivial-patches branch.

Thanks,
Laurent
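
For readers unfamiliar with g_auto(): it is built on the compiler's cleanup
attribute, so the cleanup function runs on every exit from the scope. The
sketch below is a generic illustration without GLib, not the QEMU test; the
names free_str, freed_count and early_return_demo are made up for this
example, and __attribute__((cleanup)) is a GCC/Clang extension. It shows why
initializing such a variable to NULL is the conservative choice whenever a
path can leave the function before the first assignment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts how many real allocations the cleanup handler released. */
int freed_count;

/* Cleanup handler: receives a pointer to the variable going out of scope. */
void free_str(char **p)
{
    if (*p != NULL) {
        free(*p);
        freed_count++;
    }
}

int early_return_demo(int bail_out)
{
    /* Without "= NULL", the early return below would pass an
     * indeterminate pointer to free_str(). */
    __attribute__((cleanup(free_str))) char *s = NULL;

    if (bail_out) {
        return -1;
    }

    s = malloc(4);
    assert(s != NULL);
    memcpy(s, "abc", 4);
    return (int)strlen(s);   /* evaluated before the cleanup runs */
}
```

With the initializer in place, the early-return path is harmless because the
handler sees NULL and does nothing.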

Re: [PATCH v6 0/3] ppc/e500: Add support for eSDHC

2022-11-02 Thread Bernhard Beschow

On Tue, Nov 1, 2022 at 11:29 PM Philippe Mathieu-Daudé 
wrote:

> This is a respin of Bernhard's v4 with Freescale eSDHC implemented
> as an 'UNIMP' region. See v4 cover here:
>
> https://lore.kernel.org/qemu-devel/20221018210146.193159-1-shen...@gmail.com/
>
> Since v5:
> - Rebased (ppc-next merged)
> - Properly handle big-endian
>

Tested-by: Bernhard Beschow 
Reviewed-by: Bernhard Beschow 


> Since v4:
> - Do not rename ESDHC_* definitions to USDHC_*
> - Do not modify SDHCIState structure
>
> Supersedes: <20221031115402.91912-1-phi...@linaro.org>
>
> Philippe Mathieu-Daudé (3):
>   hw/sd/sdhci: MMIO region is implemented in 32-bit accesses
>   hw/sd/sdhci: Support big endian SD host controller interfaces
>   hw/ppc/e500: Add Freescale eSDHC to e500plat
>
>  docs/system/ppc/ppce500.rst | 13 ++
>  hw/ppc/Kconfig  |  2 ++
>  hw/ppc/e500.c   | 48 -
>  hw/ppc/e500.h   |  1 +
>  hw/ppc/e500plat.c   |  1 +
>  hw/sd/sdhci-internal.h  |  1 +
>  hw/sd/sdhci.c   | 36 +---
>  include/hw/sd/sdhci.h   |  1 +
>  8 files changed, 99 insertions(+), 4 deletions(-)
>
> --
> 2.38.1
>
>

[PATCH 1/2] target/mips: Don't check COP1X for 64 bit FP mode

2022-11-02 Thread Jiaxun Yang

Some implementations (e.g. Loongson-2F) may decide to implement a 64 bit
FPU without implementing COP1X instructions.

As the eligibility of 64 bit FP instructions is already determined by
CP0St_FR, there is no need to check for COP1X again.

Signed-off-by: Jiaxun Yang 
---
 target/mips/tcg/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
index 2f2d707a12..e49d2a25a8 100644
--- a/target/mips/tcg/translate.c
+++ b/target/mips/tcg/translate.c
@@ -1545,7 +1545,7 @@ void check_cop1x(DisasContext *ctx)
  */
 void check_cp1_64bitmode(DisasContext *ctx)
 {
-if (unlikely(~ctx->hflags & (MIPS_HFLAG_F64 | MIPS_HFLAG_COP1X))) {
+if (unlikely(~ctx->hflags & MIPS_HFLAG_F64)) {
 gen_reserved_instruction(ctx);
 }
 }
-- 
2.34.1
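
As an aside, the effect of dropping MIPS_HFLAG_COP1X from the mask can be
reproduced outside QEMU. This is a minimal sketch under stated assumptions:
the SKETCH_* bit positions and the function names are invented for the
example and are not QEMU's real definitions. The expression ~hflags & mask
is nonzero as soon as any bit of the mask is missing from hflags, so the old
two-bit mask rejected a CPU that has F64 set but no COP1X.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions chosen for this sketch; not QEMU's values. */
#define SKETCH_HFLAG_F64   (1u << 0)
#define SKETCH_HFLAG_COP1X (1u << 1)

/* Old behaviour: reserved instruction unless F64 and COP1X are both set. */
bool reserved_before(uint32_t hflags)
{
    return (~hflags & (SKETCH_HFLAG_F64 | SKETCH_HFLAG_COP1X)) != 0;
}

/* New behaviour: only the F64 bit decides. */
bool reserved_after(uint32_t hflags)
{
    return (~hflags & SKETCH_HFLAG_F64) != 0;
}

/* A Loongson-2F-like state: 64-bit FPU enabled, no COP1X. */
void check_f64_without_cop1x(void)
{
    uint32_t hflags = SKETCH_HFLAG_F64;

    assert(reserved_before(hflags));   /* rejected by the old mask */
    assert(!reserved_after(hflags));   /* accepted once COP1X is dropped */
}
```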

[PATCH 2/2] target/mips: Correct check for CABS instructions

2022-11-02 Thread Jiaxun Yang

According to "MIPS Architecture for Programmers Volume IV-c:
The MIPS-3D Application-Specific Extension to the MIPS64 Architecture"
(MD00099), CABS.cond.fmt belongs to the MIPS-3D ASE, and it has nothing to do
with the COP1X opcode.

Remove all unnecessary COP1X checks and check for MIPS3D availability
in decoding code path.

Signed-off-by: Jiaxun Yang 
---
 target/mips/tcg/translate.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
index e49d2a25a8..23e575ad95 100644
--- a/target/mips/tcg/translate.c
+++ b/target/mips/tcg/translate.c
@@ -1788,16 +1788,8 @@ static inline void gen_cmp ## type ## _ ## 
fmt(DisasContext *ctx, int n,  \
 check_ps(ctx);\
 break;\
 case FMT_D:   \
-if (abs) {\
-check_cop1x(ctx); \
-} \
 check_cp1_registers(ctx, fs | ft);\
 break;\
-case FMT_S:   \
-if (abs) {\
-check_cop1x(ctx); \
-} \
-break;\
 } \
 gen_ldcmp_fpr##bits(ctx, fp0, fs);\
 gen_ldcmp_fpr##bits(ctx, fp1, ft);\
@@ -10424,6 +10416,7 @@ static void gen_farith(DisasContext *ctx, enum fopcode 
op1,
 case OPC_CMP_NGT_S:
 check_insn_opc_removed(ctx, ISA_MIPS_R6);
 if (ctx->opcode & (1 << 6)) {
+check_insn(ctx, ASE_MIPS3D);
 gen_cmpabs_s(ctx, func - 48, ft, fs, cc);
 } else {
 gen_cmp_s(ctx, func - 48, ft, fs, cc);
-- 
2.34.1

[PULL v2 00/82] pci,pc,virtio: features, tests, fixes, cleanups

2022-11-02 Thread Michael S. Tsirkin

Changes from v1:

Applied and squashed fixes by Igor, Lei He, Hesham Almatary for
bugs that tripped up the pipeline.
Updated expected files for core-count test.

The following changes since commit a11f65ec1b8adcb012b89c92819cbda4dc25aaf1:

  Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into 
staging (2022-11-01 13:49:33 -0400)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to 77dd1e2b092bb92978a2d68bed7d048ed74a5d23:

  intel-iommu: PASID support (2022-11-02 07:55:26 -0400)


pci,pc,virtio: features, tests, fixes, cleanups

lots of acpi rework
first version of biosbits infrastructure
ASID support in vhost-vdpa
core_count2 support in smbios
PCIe DOE emulation
virtio vq reset
HMAT support
part of infrastructure for viommu support in vhost-vdpa
VTD PASID support
fixes, tests all over the place

Signed-off-by: Michael S. Tsirkin 


Akihiko Odaki (1):
  msix: Assert that specified vector is in range

Alex Bennée (1):
  virtio: re-order vm_running and use_started checks

Ani Sinha (7):
  hw/i386/e820: remove legacy reserved entries for e820
  acpi/tests/avocado/bits: initial commit of test scripts that are run by 
biosbits
  acpi/tests/avocado/bits: disable acpi PSS tests that are failing in 
biosbits
  acpi/tests/avocado/bits: add biosbits config file for running bios tests
  acpi/tests/avocado/bits: add acpi and smbios avocado tests that uses 
biosbits
  acpi/tests/avocado/bits/doc: add a doc file to describe the acpi bits test
  MAINTAINERS: add myself as the maintainer for acpi biosbits avocado tests

Bernhard Beschow (3):
  hw/i386/acpi-build: Remove unused struct
  hw/i386/acpi-build: Resolve redundant attribute
  hw/i386/acpi-build: Resolve north rather than south bridges

Brice Goglin (4):
  hmat acpi: Don't require initiator value in -numa
  tests: acpi: add and whitelist *.hmat-noinitiator expected blobs
  tests: acpi: q35: add test for hmat nodes without initiators
  tests: acpi: q35: update expected blobs *.hmat-noinitiators expected HMAT:

Christian A. Ehrhardt (1):
  hw/acpi/erst.c: Fix memory handling issues

Cindy Lu (1):
  vfio: move implement of vfio_get_xlat_addr() to memory.c

David Daney (1):
  virtio-rng-pci: Allow setting nvectors, so we can use MSI-X

Eric Auger (1):
  hw/virtio/virtio-iommu-pci: Enforce the device is plugged on the root bus

Gregory Price (1):
  hw/i386/pc.c: CXL Fixed Memory Window should not reserve e820 in bios

Hesham Almatary (3):
  tests: Add HMAT AArch64/virt empty table files
  tests: acpi: aarch64/virt: add a test for hmat nodes with no initiators
  tests: virt: Update expected *.acpihmatvirt tables

Huai-Cheng Kuo (3):
  hw/pci: PCIe Data Object Exchange emulation
  hw/cxl/cdat: CXL CDAT Data Object Exchange implementation
  hw/mem/cxl-type3: Add CXL CDAT Data Object Exchange

Igor Mammedov (11):
  acpi: pc: vga: use AcpiDevAmlIf interface to build VGA device descriptors
  tests: acpi: whitelist DSDT before generating PCI-ISA bridge AML 
automatically
  acpi: pc/q35: drop ad-hoc PCI-ISA bridge AML routines and let bus 
ennumeration generate AML
  tests: acpi: update expected DSDT after ISA bridge is moved directly 
under PCI host bridge
  tests: acpi: whitelist DSDT before generating ICH9_SMB AML automatically
  acpi: add get_dev_aml_func() helper
  acpi: enumerate SMB bridge automatically along with other PCI devices
  tests: acpi: update expected blobs
  tests: acpi: pc/q35 whitelist DSDT before \_GPE cleanup
  acpi: pc/35: sanitize _GPE declaration order
  tests: acpi: update expected blobs

Jason Wang (4):
  intel-iommu: don't warn guest errors when getting rid2pasid entry
  intel-iommu: drop VTDBus
  intel-iommu: convert VTD_PE_GET_FPD_ERR() to be a function
  intel-iommu: PASID support

Jonathan Cameron (2):
  hw/mem/cxl-type3: Add MSIX support
  hw/pci-bridge/cxl-upstream: Add a CDAT table access DOE

Julia Suvorova (5):
  hw/smbios: add core_count2 to smbios table type 4
  bios-tables-test: teach test to use smbios 3.0 tables
  tests/acpi: allow changes for core_count2 test
  bios-tables-test: add test for number of cores > 255
  tests/acpi: update tables for new core count test

Kangjie Xu (10):
  virtio: introduce virtio_queue_enable()
  virtio: core: vq reset feature negotation support
  virtio-pci: support queue enable
  vhost: expose vhost_virtqueue_start()
  vhost: expose vhost_virtqueue_stop()
  vhost-net: vhost-kernel: introduce vhost_net_virtqueue_reset()
  vhost-net: vhost-kernel: introduce vhost_net_virtqueue_restart()
  virtio-net: introduce flush_or_purge_queued_packets()

Re: [PATCH] linux-user: always translate cmsg when recvmsg

2022-11-02 Thread Laurent Vivier


On 28/10/2022 at 10:12, Icenowy Zheng wrote:

It's possible that a message contains both normal payload and ancillary
data in the same message, and even if no ancillary data is available
this information should be passed to the target, otherwise the target
cmsghdr will be left uninitialized and the target is going to access
uninitialized memory if it expects cmsg.

Always call the function that translates cmsg on recvmsg, because that
function should be empty-cmsg-safe (it creates an empty cmsg in the
target).

Signed-off-by: Icenowy Zheng 
---
  linux-user/syscall.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 8402c1399d..029a4e8b42 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -3346,7 +3346,8 @@ static abi_long do_sendrecvmsg_locked(int fd, struct 
target_msghdr *msgp,
  if (fd_trans_host_to_target_data(fd)) {
  ret = fd_trans_host_to_target_data(fd)(msg.msg_iov->iov_base,
 MIN(msg.msg_iov->iov_len, 
len));
-} else {
+}
+if (!is_error(ret)) {
  ret = host_to_target_cmsg(msgp, &msg);
  }
  if (!is_error(ret)) {


Applied to my linux-user-for-7.2 branch.

Thanks,
Laurent
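
For context, this is roughly what "no ancillary data" looks like to a
program that expects cmsg. The sketch is host-side POSIX code, not QEMU's
host_to_target_cmsg(); count_cmsgs and count_cmsgs_in_empty_message are
names made up for the example. A receiver typically walks the control buffer
with CMSG_FIRSTHDR/CMSG_NXTHDR, and that walk is only well defined if
msg_control and msg_controllen have been filled in, which is what the
always-translate change is meant to guarantee on the target side.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Count the ancillary-data headers attached to a message. */
int count_cmsgs(struct msghdr *msg)
{
    int n = 0;

    for (struct cmsghdr *c = CMSG_FIRSTHDR(msg); c != NULL;
         c = CMSG_NXTHDR(msg, c)) {
        n++;
    }
    return n;
}

/* A zero-initialized header has no control buffer at all. */
int count_cmsgs_in_empty_message(void)
{
    struct msghdr msg = { 0 };

    assert(msg.msg_control == NULL && msg.msg_controllen == 0);
    return count_cmsgs(&msg);
}
```

If the header fields were left uninitialized instead, the same loop would
read whatever happened to be in memory.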

[PULL v2 25/82] tests/acpi: virt: update ACPI MADT and FADT binaries

2022-11-02 Thread Michael S. Tsirkin

From: Miguel Luis 

Step 6 & 7 of the bios-tables-test.c documented procedure.

Differences between disassembled ASL files for MADT:

@@ -11,9 +11,9 @@
  */

 [000h    4]Signature : "APIC"[Multiple APIC 
Description Table (MADT)]
-[004h 0004   4] Table Length : 00A8
-[008h 0008   1] Revision : 03
-[009h 0009   1] Checksum : 50
+[004h 0004   4] Table Length : 00AC
+[008h 0008   1] Revision : 04
+[009h 0009   1] Checksum : 47
 [00Ah 0010   6]   Oem ID : "BOCHS "
 [010h 0016   8] Oem Table ID : "BXPC"
 [018h 0024   4] Oem Revision : 0001
@@ -34,7 +34,7 @@
 [041h 0065   3] Reserved : 00

 [044h 0068   1]Subtable Type : 0B [Generic Interrupt 
Controller]
-[045h 0069   1]   Length : 4C
+[045h 0069   1]   Length : 50
 [046h 0070   2] Reserved : 
 [048h 0072   4] CPU Interface Number : 
 [04Ch 0076   4]Processor UID : 
@@ -51,28 +51,29 @@
 [07Ch 0124   4]Virtual GIC Interrupt : 
 [080h 0128   8]   Redistributor Base Address : 
 [088h 0136   8]ARM MPIDR : 
-/ ACPI subtable terminates early - may be older version (dump table) */
+[090h 0144   1] Efficiency Class : 00
+[091h 0145   3] Reserved : 00

-[090h 0144   1]Subtable Type : 0D [Generic MSI Frame]
-[091h 0145   1]   Length : 18
-[092h 0146   2] Reserved : 
-[094h 0148   4] MSI Frame ID : 
-[098h 0152   8] Base Address : 0802
-[0A0h 0160   4]Flags (decoded below) : 0001
+[094h 0148   1]Subtable Type : 0D [Generic MSI Frame]
+[095h 0149   1]   Length : 18
+[096h 0150   2] Reserved : 
+[098h 0152   4] MSI Frame ID : 
+[09Ch 0156   8] Base Address : 0802
+[0A4h 0164   4]Flags (decoded below) : 0001
   Select SPI : 1
-[0A4h 0164   2]SPI Count : 0040
-[0A6h 0166   2] SPI Base : 0050
+[0A8h 0168   2]SPI Count : 0040
+[0AAh 0170   2] SPI Base : 0050

-Raw Table Data: Length 168 (0xA8)
+Raw Table Data: Length 172 (0xAC)

-: 41 50 49 43 A8 00 00 00 03 50 42 4F 43 48 53 20  // APIC.PBOCHS
+: 41 50 49 43 AC 00 00 00 04 47 42 4F 43 48 53 20  // APIC.GBOCHS
 0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43  // BXPCBXPC
 0020: 01 00 00 00 00 00 00 00 00 00 00 00 0C 18 00 00  // 
 0030: 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 00  // 
-0040: 02 00 00 00 0B 4C 00 00 00 00 00 00 00 00 00 00  // .L..
+0040: 02 00 00 00 0B 50 00 00 00 00 00 00 00 00 00 00  // .P..
 0050: 01 00 00 00 00 00 00 00 17 00 00 00 00 00 00 00  // 
 0060: 00 00 00 00 00 00 01 08 00 00 00 00 00 00 04 08  // 
 0070: 00 00 00 00 00 00 03 08 00 00 00 00 00 00 00 00  // 
 0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  // 
-0090: 0D 18 00 00 00 00 00 00 00 00 02 08 00 00 00 00  // 
-00A0: 01 00 00 00 40 00 50 00  // @.P.
+0090: 00 00 00 00 0D 18 00 00 00 00 00 00 00 00 02 08  // 
+00A0: 00 00 00 00 01 00 00 00 40 00 50 00  // @.P.

Differences between disassembled ASL files for FADT:

@@ -11,9 +11,9 @@
  */

 [000h    4]Signature : "FACP"[Fixed ACPI 
Description Table (FADT)]
-[004h 0004   4] Table Length : 010C
-[008h 0008   1] Revision : 05
-[009h 0009   1] Checksum : 55
+[004h 0004   4] Table Length : 0114
+[008h 0008   1] Revision : 06
+[009h 0009   1] Checksum : 15
 [00Ah 0010   6]   Oem ID : "BOCHS "
 [010h 0016   8] Oem Table ID : "BXPC"
 [018h 0024   4] Oem Revision : 0001
@@ -99,7 +99,7 @@
   PSCI Compliant : 1
Must use HVC for PSCI : 1

-[083h 0131   1]  FADT Minor Revision : 01
+[083h 0131   1]  FADT Minor Revision : 00
 [084h 0132   8] FACS Address : 
 [08Ch 0140   8] DSDT Address : 
 [094h 0148  12] PM1A Event Block : [Generic Address Structure]
@@ -173,11 +173,11 @@
 [103h 0259   1] Encoded Access Width : 00 [Undefined/Legacy]
 [104h 0260   8]

Re: [PATCH] linux-user: Add strace output for timer_settime64() syscall

2022-11-02 Thread Laurent Vivier


On 24/10/2022 at 22:45, Helge Deller wrote:

Add missing timer_settime64() strace output and specify format for
timer_settime().

Signed-off-by: Helge Deller 

diff --git a/linux-user/strace.list b/linux-user/strace.list
index cd995e5d56..3a898e2532 100644
--- a/linux-user/strace.list
+++ b/linux-user/strace.list
@@ -1534,7 +1534,10 @@
  { TARGET_NR_timer_gettime, "timer_gettime" , NULL, NULL, NULL },
  #endif
  #ifdef TARGET_NR_timer_settime
-{ TARGET_NR_timer_settime, "timer_settime" , NULL, NULL, NULL },
+{ TARGET_NR_timer_settime, "timer_settime" , "%s(%d,%d,%p,%p)", NULL, NULL },
+#endif
+#ifdef TARGET_NR_timer_settime64
+{ TARGET_NR_timer_settime64, "timer_settime64" , "%s(%d,%d,%p,%p)", NULL, NULL 
},
  #endif
  #ifdef TARGET_NR_timerfd
  { TARGET_NR_timerfd, "timerfd" , NULL, NULL, NULL },



Applied to my linux-user-for-7.2 branch.

Thanks,
Laurent

[PULL v2 23/82] acpi: fadt: support revision 6.0 of the ACPI specification

2022-11-02 Thread Michael S. Tsirkin

From: Miguel Luis 

Update the Fixed ACPI Description Table (FADT) to revision 6.0 of the ACPI
specification adding the field "Hypervisor Vendor Identity".

This field's description states the following: "64-bit identifier of hypervisor
vendor. All bytes in this field are considered part of the vendor identity.
These identifiers are defined independently by the vendors themselves,
usually following the name of the hypervisor product. Version information
should NOT be included in this field - this shall simply denote the vendor's
name or identifier. Version information can be communicated through a
supplemental vendor-specific hypervisor API. Firmware implementers would
place zero bytes into this field, denoting that no hypervisor is present in
the actual firmware."

Signed-off-by: Miguel Luis 
Reviewed-by: Ani Sinha 
Message-Id: <20221011181730.10885-3-miguel.l...@oracle.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/acpi/aml-build.c  | 13 ++---
 hw/arm/virt-acpi-build.c | 10 +-
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index e6bfac95c7..42feb4d4d7 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2070,7 +2070,7 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, 
MachineState *ms,
 acpi_table_end(linker, &table);
 }
 
-/* build rev1/rev3/rev5.1 FADT */
+/* build rev1/rev3/rev5.1/rev6.0 FADT */
 void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
 const char *oem_id, const char *oem_table_id)
 {
@@ -2193,8 +2193,15 @@ void build_fadt(GArray *tbl, BIOSLinker *linker, const 
AcpiFadtData *f,
 /* SLEEP_STATUS_REG */
 build_append_gas_from_struct(tbl, &f->sleep_sts);
 
-/* TODO: extra fields need to be added to support revisions above rev5 */
-assert(f->rev == 5);
+if (f->rev == 5) {
+goto done;
+}
+
+/* Hypervisor Vendor Identity */
+build_append_padded_str(tbl, "QEMU", 8, '\0');
+
+/* TODO: extra fields need to be added to support revisions above rev6 */
+assert(f->rev == 6);
 
 done:
 acpi_table_end(linker, &table);
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 13c6e3e468..e5377744f3 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -808,13 +808,13 @@ build_madt(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 }
 
 /* FADT */
-static void build_fadt_rev5(GArray *table_data, BIOSLinker *linker,
+static void build_fadt_rev6(GArray *table_data, BIOSLinker *linker,
 VirtMachineState *vms, unsigned dsdt_tbl_offset)
 {
-/* ACPI v5.1 */
+/* ACPI v6.0 */
 AcpiFadtData fadt = {
-.rev = 5,
-.minor_ver = 1,
+.rev = 6,
+.minor_ver = 0,
 .flags = 1 << ACPI_FADT_F_HW_REDUCED_ACPI,
 .xdsdt_tbl_offset = &dsdt_tbl_offset,
 };
@@ -944,7 +944,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables 
*tables)
 
 /* FADT MADT PPTT GTDT MCFG SPCR DBG2 pointed to by RSDT */
 acpi_add_table(table_offsets, tables_blob);
-build_fadt_rev5(tables_blob, tables->linker, vms, dsdt);
+build_fadt_rev6(tables_blob, tables->linker, vms, dsdt);
 
 acpi_add_table(table_offsets, tables_blob);
 build_madt(tables_blob, tables->linker, vms);
-- 
MST
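
To make the new field concrete: the patch writes "QEMU" into the 8-byte
Hypervisor Vendor Identity field and fills the rest with zero bytes. The
following is a minimal sketch on a plain buffer, assuming nothing about
QEMU's GArray-based builder; sketch_fill_padded is a hypothetical helper
that only imitates the build_append_padded_str() call in the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper for this sketch: copy str into a fixed-width field
 * and fill the remainder with pad. It is not QEMU's
 * build_append_padded_str(), only an imitation of the call in the patch.
 */
void sketch_fill_padded(char *field, const char *str, size_t width, char pad)
{
    size_t len = strlen(str);

    assert(len <= width);          /* the string must fit the field */
    memcpy(field, str, len);
    memset(field + len, pad, width - len);
}
```

Called as sketch_fill_padded(field, "QEMU", 8, '\0'), this yields the four
ASCII bytes of "QEMU" followed by four zero bytes. That keeps the table
length a fixed 8 bytes larger than the rev 5.1 layout, matching the 010C to
0114 length change shown in the FADT diff of patch 25/82.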

[PULL v2 46/82] virtio-net: support queue reset

2022-11-02 Thread Michael S. Tsirkin

From: Xuan Zhuo 

virtio-net and vhost-kernel implement queue reset.
Queued packets in the corresponding queue pair are flushed
or purged.

For virtio-net, userspace datapath will be disabled later in
__virtio_queue_reset(). It will set addr of vring to 0 and idx to 0.
Thus, virtio_net_receive() and virtio_net_flush_tx() will not receive
or send packets.

For vhost-net, the datapath will be disabled in vhost_net_virtqueue_reset().

Signed-off-by: Xuan Zhuo 
Signed-off-by: Kangjie Xu 
Acked-by: Jason Wang 
Message-Id: <20221017092558.111082-13-xuanz...@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/net/virtio-net.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 038a6fba7c..34fb4b1423 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -546,6 +546,23 @@ static RxFilterInfo 
*virtio_net_query_rxfilter(NetClientState *nc)
 return info;
 }
 
+static void virtio_net_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
+{
+VirtIONet *n = VIRTIO_NET(vdev);
+NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(queue_index));
+
+if (!nc->peer) {
+return;
+}
+
+if (get_vhost_net(nc->peer) &&
+nc->peer->info->type == NET_CLIENT_DRIVER_TAP) {
+vhost_net_virtqueue_reset(vdev, nc, queue_index);
+}
+
+flush_or_purge_queued_packets(nc);
+}
+
 static void virtio_net_reset(VirtIODevice *vdev)
 {
 VirtIONet *n = VIRTIO_NET(vdev);
@@ -3827,6 +3844,7 @@ static void virtio_net_class_init(ObjectClass *klass, 
void *data)
 vdc->set_features = virtio_net_set_features;
 vdc->bad_features = virtio_net_bad_features;
 vdc->reset = virtio_net_reset;
+vdc->queue_reset = virtio_net_queue_reset;
 vdc->set_status = virtio_net_set_status;
 vdc->guest_notifier_mask = virtio_net_guest_notifier_mask;
 vdc->guest_notifier_pending = virtio_net_guest_notifier_pending;
-- 
MST
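
The control flow of virtio_net_queue_reset() above can be modelled in
isolation. This is a sketch with hypothetical stand-in types (SketchPeer,
SketchQueue) and counters in place of the real vhost and net calls; it only
mirrors the order of the checks in the patch: no peer means nothing to do,
a vhost-backed TAP peer gets its vhost datapath reset first, and queued
packets are flushed or purged in either case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for NetClientState and its peer. */
typedef struct SketchPeer {
    bool has_vhost;   /* models get_vhost_net(peer) != NULL */
    bool is_tap;      /* models info->type == NET_CLIENT_DRIVER_TAP */
} SketchPeer;

typedef struct SketchQueue {
    SketchPeer *peer;
    int vhost_resets;   /* times the vhost datapath was reset */
    int flushes;        /* times queued packets were flushed or purged */
} SketchQueue;

void sketch_queue_reset(SketchQueue *q)
{
    assert(q != NULL);

    if (!q->peer) {
        return;                       /* no backend attached */
    }
    if (q->peer->has_vhost && q->peer->is_tap) {
        q->vhost_resets++;            /* vhost-kernel path only */
    }
    q->flushes++;                     /* always drop or deliver leftovers */
}
```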

[PULL v2 38/82] virtio: core: vq reset feature negotation support

2022-11-02 Thread Michael S. Tsirkin

From: Kangjie Xu 

A new command line parameter "queue_reset" is added.

Meanwhile, the vq reset feature is disabled for pre-7.2 machines.

Signed-off-by: Kangjie Xu 
Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
Message-Id: <20221017092558.111082-5-xuanz...@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/virtio.h | 4 +++-
 hw/core/machine.c  | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 5cd7861aeb..18a8920cc0 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -313,7 +313,9 @@ typedef struct VirtIORNGConf VirtIORNGConf;
 DEFINE_PROP_BIT64("iommu_platform", _state, _field, \
   VIRTIO_F_IOMMU_PLATFORM, false), \
 DEFINE_PROP_BIT64("packed", _state, _field, \
-  VIRTIO_F_RING_PACKED, false)
+  VIRTIO_F_RING_PACKED, false), \
+DEFINE_PROP_BIT64("queue_reset", _state, _field, \
+  VIRTIO_F_RING_RESET, true)
 
 hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n);
 bool virtio_queue_enabled_legacy(VirtIODevice *vdev, int n);
diff --git a/hw/core/machine.c b/hw/core/machine.c
index aa520e74a8..907fa78ff0 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -40,7 +40,9 @@
 #include "hw/virtio/virtio-pci.h"
 #include "qom/object_interfaces.h"
 
-GlobalProperty hw_compat_7_1[] = {};
+GlobalProperty hw_compat_7_1[] = {
+{ "virtio-device", "queue_reset", "false" },
+};
 const size_t hw_compat_7_1_len = G_N_ELEMENTS(hw_compat_7_1);
 
 GlobalProperty hw_compat_7_0[] = {
-- 
MST
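
The interaction between the property default and the compat table can be
sketched as follows. This is a simplified model, not QEMU's property
machinery: SketchCompatProp, sketch_compat_7_1 and
sketch_queue_reset_enabled are hypothetical names, and the real code applies
GlobalProperty entries through QOM rather than a hand-written loop. The idea
is the same though: the device default is "true", and machine types that
pull in the 7.1 compat entries override it to "false".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of a (driver, property, value) compat entry. */
typedef struct SketchCompatProp {
    const char *driver;
    const char *property;
    const char *value;
} SketchCompatProp;

static const SketchCompatProp sketch_compat_7_1[] = {
    { "virtio-device", "queue_reset", "false" },
};

bool sketch_queue_reset_enabled(bool machine_is_pre_7_2)
{
    bool enabled = true;    /* device default, as in the property definition */
    size_t n = sizeof(sketch_compat_7_1) / sizeof(sketch_compat_7_1[0]);
    size_t i;

    if (!machine_is_pre_7_2) {
        return enabled;     /* 7.2 and newer keep the default */
    }
    for (i = 0; i < n; i++) {
        const SketchCompatProp *p = &sketch_compat_7_1[i];

        assert(p->driver && p->property && p->value);
        if (strcmp(p->driver, "virtio-device") == 0 &&
            strcmp(p->property, "queue_reset") == 0) {
            enabled = strcmp(p->value, "true") == 0;
        }
    }
    return enabled;
}
```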

[PULL v2 43/82] vhost-net: vhost-kernel: introduce vhost_net_virtqueue_reset()

2022-11-02 Thread Michael S. Tsirkin

From: Kangjie Xu 

Introduce vhost_net_virtqueue_reset(), which can reset the specific
virtqueue in the device. It then unmaps the vrings and the desc
of the virtqueue.

Here we do not reuse the vhost_net_stop_one() or vhost_dev_stop(),
because they work at queue pair level. We do not use
vhost_virtqueue_stop() because it may stop the device in the
backend.

This patch only considers the case of vhost-kernel, when
NetClientDriver is NET_CLIENT_DRIVER_TAP.

Furthermore, we do not need net->nc->info->poll() because
it enables userspace datapath and we want to stop all
datapaths for this reset virtqueue here.

Signed-off-by: Kangjie Xu 
Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
Message-Id: <20221017092558.111082-10-xuanz...@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/net/vhost_net.h |  2 ++
 hw/net/vhost_net-stub.c |  6 ++
 hw/net/vhost_net.c  | 25 +
 3 files changed, 33 insertions(+)

diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
index 387e913e4e..85d85a4957 100644
--- a/include/net/vhost_net.h
+++ b/include/net/vhost_net.h
@@ -48,4 +48,6 @@ uint64_t vhost_net_get_acked_features(VHostNetState *net);
 
 int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu);
 
+void vhost_net_virtqueue_reset(VirtIODevice *vdev, NetClientState *nc,
+   int vq_index);
 #endif
diff --git a/hw/net/vhost_net-stub.c b/hw/net/vhost_net-stub.c
index 89d71cfb8e..2d745e359c 100644
--- a/hw/net/vhost_net-stub.c
+++ b/hw/net/vhost_net-stub.c
@@ -101,3 +101,9 @@ int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
 {
 return 0;
 }
+
+void vhost_net_virtqueue_reset(VirtIODevice *vdev, NetClientState *nc,
+   int vq_index)
+{
+
+}
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index d6924f5e57..519dced899 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -531,3 +531,28 @@ int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
 
 return vhost_ops->vhost_net_set_mtu(&net->dev, mtu);
 }
+
+void vhost_net_virtqueue_reset(VirtIODevice *vdev, NetClientState *nc,
+   int vq_index)
+{
+VHostNetState *net = get_vhost_net(nc->peer);
+const VhostOps *vhost_ops = net->dev.vhost_ops;
+struct vhost_vring_file file = { .fd = -1 };
+int idx;
+
+/* should only be called after backend is connected */
+assert(vhost_ops);
+
+idx = vhost_ops->vhost_get_vq_index(&net->dev, vq_index);
+
+if (net->nc->info->type == NET_CLIENT_DRIVER_TAP) {
+file.index = idx;
+int r = vhost_net_set_backend(&net->dev, &file);
+assert(r >= 0);
+}
+
+vhost_virtqueue_stop(&net->dev,
+ vdev,
+ net->dev.vqs + idx,
+ net->dev.vq_index + idx);
+}
-- 
MST

[PULL v2 67/82] hw/i386/acpi-build: Remove unused struct

2022-11-02 Thread Michael S. Tsirkin

From: Bernhard Beschow 

Amends commit b23046abe78f48498a423b802d6d86ba0172d57f 'pc: acpi-build:
simplify PCI bus tree generation'.

Signed-off-by: Bernhard Beschow 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20221026133110.91828-2-shen...@gmail.com>
Message-Id: <20221028103419.93398-2-shen...@gmail.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 960305462c..1ebf14b899 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -121,13 +121,6 @@ typedef struct AcpiMiscInfo {
 unsigned dsdt_size;
 } AcpiMiscInfo;
 
-typedef struct AcpiBuildPciBusHotplugState {
-GArray *device_table;
-GArray *notify_table;
-struct AcpiBuildPciBusHotplugState *parent;
-bool pcihp_bridge_en;
-} AcpiBuildPciBusHotplugState;
-
 typedef struct FwCfgTPMConfig {
 uint32_t tpmppi_address;
 uint8_t tpm_version;
-- 
MST

[PULL v2 07/82] virtio-crypto: Support asynchronous mode

2022-11-02 Thread Michael S. Tsirkin

From: Lei He 

virtio-crypto: Modify the current interface of virtio-crypto
device to support asynchronous mode.

Signed-off-by: lei he 
Message-Id: <20221008085030.70212-2-helei.si...@bytedance.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/sysemu/cryptodev.h  |  60 --
 backends/cryptodev-builtin.c|  69 +--
 backends/cryptodev-vhost-user.c |  53 +++--
 backends/cryptodev.c|  44 +++--
 hw/virtio/virtio-crypto.c   | 339 ++--
 5 files changed, 347 insertions(+), 218 deletions(-)

diff --git a/include/sysemu/cryptodev.h b/include/sysemu/cryptodev.h
index 37c3a360fd..32e9f4cf8a 100644
--- a/include/sysemu/cryptodev.h
+++ b/include/sysemu/cryptodev.h
@@ -113,6 +113,7 @@ typedef struct CryptoDevBackendSessionInfo {
 CryptoDevBackendSymSessionInfo sym_sess_info;
 CryptoDevBackendAsymSessionInfo asym_sess_info;
 } u;
+uint64_t session_id;
 } CryptoDevBackendSessionInfo;
 
 /**
@@ -188,21 +189,30 @@ typedef struct CryptoDevBackendOpInfo {
 } u;
 } CryptoDevBackendOpInfo;
 
+typedef void (*CryptoDevCompletionFunc) (void *opaque, int ret);
 struct CryptoDevBackendClass {
 ObjectClass parent_class;
 
 void (*init)(CryptoDevBackend *backend, Error **errp);
 void (*cleanup)(CryptoDevBackend *backend, Error **errp);
 
-int64_t (*create_session)(CryptoDevBackend *backend,
-   CryptoDevBackendSessionInfo *sess_info,
-   uint32_t queue_index, Error **errp);
+int (*create_session)(CryptoDevBackend *backend,
+  CryptoDevBackendSessionInfo *sess_info,
+  uint32_t queue_index,
+  CryptoDevCompletionFunc cb,
+  void *opaque);
+
 int (*close_session)(CryptoDevBackend *backend,
-   uint64_t session_id,
-   uint32_t queue_index, Error **errp);
+ uint64_t session_id,
+ uint32_t queue_index,
+ CryptoDevCompletionFunc cb,
+ void *opaque);
+
 int (*do_op)(CryptoDevBackend *backend,
- CryptoDevBackendOpInfo *op_info,
- uint32_t queue_index, Error **errp);
+ CryptoDevBackendOpInfo *op_info,
+ uint32_t queue_index,
+ CryptoDevCompletionFunc cb,
+ void *opaque);
 };
 
 typedef enum CryptoDevBackendOptionsType {
@@ -303,15 +313,20 @@ void cryptodev_backend_cleanup(
  * @sess_info: parameters needed by session creating
  * @queue_index: queue index of cryptodev backend client
  * @errp: pointer to a NULL-initialized error object
+ * @cb: callback when session create is completed
+ * @opaque: parameter passed to callback
  *
- * Create a session for symmetric/symmetric algorithms
+ * Create a session for symmetric/asymmetric algorithms
  *
- * Returns: session id on success, or -1 on error
+ * Returns: 0 for success and cb will be called when creation is completed,
+ * negative value for error, and cb will not be called.
  */
-int64_t cryptodev_backend_create_session(
+int cryptodev_backend_create_session(
CryptoDevBackend *backend,
CryptoDevBackendSessionInfo *sess_info,
-   uint32_t queue_index, Error **errp);
+   uint32_t queue_index,
+   CryptoDevCompletionFunc cb,
+   void *opaque);
 
 /**
  * cryptodev_backend_close_session:
@@ -319,34 +334,43 @@ int64_t cryptodev_backend_create_session(
  * @session_id: the session id
  * @queue_index: queue index of cryptodev backend client
  * @errp: pointer to a NULL-initialized error object
+ * @cb: callback when session create is completed
+ * @opaque: parameter passed to callback
  *
  * Close a session for which was previously
  * created by cryptodev_backend_create_session()
  *
- * Returns: 0 on success, or Negative on error
+ * Returns: 0 for success and cb will be called when creation is completed,
+ * negative value for error, and cb will not be called.
  */
 int cryptodev_backend_close_session(
CryptoDevBackend *backend,
uint64_t session_id,
-   uint32_t queue_index, Error **errp);
+   uint32_t queue_index,
+   CryptoDevCompletionFunc cb,
+   void *opaque);
 
 /**
  * cryptodev_backend_crypto_operation:
  * @backend: the cryptodev backend object
- * @opaque: pointer to a VirtIOCryptoReq object
+ * @opaque1: pointer to a VirtIOCryptoReq object
  * @queue_index: queue index of cryptodev backend client
  * @errp: pointer to a NULL-initialized error object
+ * @cb: callbacks when operation is completed
+ * @opaque2: parameter passed to cb
  *
  * Do crypto operation, such as encryption and
  * decryption
  *
- * Returns: VIRTIO_CRYPTO_OK on success,
- * or -VIRTIO_CRYPTO_* on error
+ * Returns: 0 for success and cb will be called when creation
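
The calling convention described in the comments above (return 0 and call cb
on completion, or return a negative value and never call cb) can be sketched
in isolation. This is only a minimal illustration, not the code from the
patch: the CryptoDevCompletionFunc typedef is copied from the diff, while
toy_request, toy_request_complete() and toy_do_op() are hypothetical names,
and the toy backend completes synchronously for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the typedef added by the patch. */
typedef void (*CryptoDevCompletionFunc)(void *opaque, int ret);

/* Hypothetical per-request state owned by the caller. */
struct toy_request {
    int done;    /* set to 1 once the completion callback has run */
    int status;  /* result passed to the completion callback */
};

/* Hypothetical completion callback: records the result in the request. */
static void toy_request_complete(void *opaque, int ret)
{
    struct toy_request *req = opaque;

    req->done = 1;
    req->status = ret;
}

/*
 * Hypothetical backend operation following the documented contract:
 * a negative return means submission failed and cb is never called;
 * a return of 0 means cb is (eventually) called exactly once.
 */
static int toy_do_op(int input, CryptoDevCompletionFunc cb, void *opaque)
{
    if (input < 0) {
        return -1;
    }

    /* A truly asynchronous backend would invoke this later. */
    cb(opaque, 0);
    return 0;
}
```

The point of the contract is that the caller releases or completes its
request in exactly one place: in the callback on success, or immediately on
a negative return.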

[PULL v2 31/82] vhost: Change the sequence of device start

2022-11-02 Thread Michael S. Tsirkin

From: Yajun Wu 

This patch is part of adding vhost-user vhost_dev_start support. The
motivation is to improve backend configuration speed and reduce live
migration VM downtime.

Move the device start routines to after all the necessary device and VQ
configuration has finished, further aligning with the "device
initialization sequence" in the virtio specification.

A following patch will add vhost-user vhost_dev_start support.

Signed-off-by: Yajun Wu 
Acked-by: Parav Pandit 

Message-Id: <20221017064452.1226514-2-yaj...@nvidia.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/block/vhost-user-blk.c | 18 +++---
 hw/net/vhost_net.c| 12 ++--
 2 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 13bf5cc47a..28409c90f7 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -168,13 +168,6 @@ static int vhost_user_blk_start(VirtIODevice *vdev, Error 
**errp)
 goto err_guest_notifiers;
 }
 
-ret = vhost_dev_start(&s->dev, vdev);
-if (ret < 0) {
-error_setg_errno(errp, -ret, "Error starting vhost");
-goto err_guest_notifiers;
-}
-s->started_vu = true;
-
 /* guest_notifier_mask/pending not used yet, so just unmask
  * everything here. virtio-pci will do the right thing by
  * enabling/disabling irqfd.
@@ -183,9 +176,20 @@ static int vhost_user_blk_start(VirtIODevice *vdev, Error 
**errp)
 vhost_virtqueue_mask(&s->dev, vdev, i, false);
 }
 
+s->dev.vq_index_end = s->dev.nvqs;
+ret = vhost_dev_start(&s->dev, vdev);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Error starting vhost");
+goto err_guest_notifiers;
+}
+s->started_vu = true;
+
 return ret;
 
 err_guest_notifiers:
+for (i = 0; i < s->dev.nvqs; i++) {
+vhost_virtqueue_mask(&s->dev, vdev, i, true);
+}
 k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
 err_host_notifiers:
 vhost_dev_disable_notifiers(&s->dev, vdev);
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index d28f8b974b..d6924f5e57 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -387,21 +387,21 @@ int vhost_net_start(VirtIODevice *dev, NetClientState 
*ncs,
 } else {
 peer = qemu_get_peer(ncs, n->max_queue_pairs);
 }
-r = vhost_net_start_one(get_vhost_net(peer), dev);
-
-if (r < 0) {
-goto err_start;
-}
 
 if (peer->vring_enable) {
 /* restore vring enable state */
 r = vhost_set_vring_enable(peer, peer->vring_enable);
 
 if (r < 0) {
-vhost_net_stop_one(get_vhost_net(peer), dev);
 goto err_start;
 }
 }
+
+r = vhost_net_start_one(get_vhost_net(peer), dev);
+if (r < 0) {
+vhost_net_stop_one(get_vhost_net(peer), dev);
+goto err_start;
+}
 }
 
 return 0;
-- 
MST

[PULL v2 49/82] virtio-net: enable vq reset feature

2022-11-02 Thread Michael S. Tsirkin

From: Xuan Zhuo 

Add virtqueue reset feature for virtio-net

Signed-off-by: Xuan Zhuo 
Message-Id: <20221017092558.111082-16-xuanz...@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/net/virtio-net.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index e68daf51bb..8b32339b76 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -788,6 +788,7 @@ static uint64_t virtio_net_get_features(VirtIODevice *vdev, 
uint64_t features,
 }
 
 if (!get_vhost_net(nc->peer)) {
+virtio_add_feature(&features, VIRTIO_F_RING_RESET);
 return features;
 }
 
-- 
MST

[PULL v2 79/82] intel-iommu: don't warn guest errors when getting rid2pasid entry

2022-11-02 Thread Michael S. Tsirkin

From: Jason Wang 

We used to warn on a wrong rid2pasid entry. But this error can be
triggered by the guest and can happen during initialization, so
don't warn in this case.

Reviewed-by: Peter Xu 
Signed-off-by: Jason Wang 
Message-Id: <20221028061436.30093-2-jasow...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Yi Liu 
---
 hw/i386/intel_iommu.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 6524c2ee32..271de995be 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1554,8 +1554,10 @@ static bool vtd_dev_pt_enabled(IntelIOMMUState *s, 
VTDContextEntry *ce)
 if (s->root_scalable) {
 ret = vtd_ce_get_rid2pasid_entry(s, ce, &pe);
 if (ret) {
-error_report_once("%s: vtd_ce_get_rid2pasid_entry error: %"PRId32,
-  __func__, ret);
+/*
+ * This error is guest triggerable. We should assume PT
+ * not enabled for safety.
+ */
 return false;
 }
 return (VTD_PE_GET_TYPE(&pe) == VTD_SM_PASID_ENTRY_PT);
@@ -1569,14 +1571,12 @@ static bool vtd_as_pt_enabled(VTDAddressSpace *as)
 {
 IntelIOMMUState *s;
 VTDContextEntry ce;
-int ret;
 
 assert(as);
 
 s = as->iommu_state;
-ret = vtd_dev_to_context_entry(s, pci_bus_num(as->bus),
-   as->devfn, &ce);
-if (ret) {
+if (vtd_dev_to_context_entry(s, pci_bus_num(as->bus), as->devfn,
+ &ce)) {
 /*
  * Possibly failed to parse the context entry for some reason
  * (e.g., during init, or any guest configuration errors on
-- 
MST

[PULL v2 58/82] acpi: enumerate SMB bridge automatically along with other PCI devices

2022-11-02 Thread Michael S. Tsirkin

From: Igor Mammedov 

to make that happen (bridge sits at _ADR: 0x001F0003),
relax PCI enumeration logic to include devices with *function* > 0
if device has something to say about itself (i.e. has build_dev_aml
callback set).

Signed-off-by: Igor Mammedov 
Message-Id: <20221017102146.2254096-8-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 27 +++
 1 file changed, 3 insertions(+), 24 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index e1483bb11a..916343d8d6 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -448,9 +448,10 @@ static void build_append_pci_bus_devices(Aml 
*parent_scope, PCIBus *bus,
 /*
  * allow describing coldplugged bridges in ACPI even if they are 
not
  * on function 0, as they are not unpluggable, for all other 
devices
- * generate description only for function 0 per slot
+ * generate description only for function 0 per slot, and for other
+ * functions if device on function provides its own AML
  */
-if (func && !bridge_in_acpi) {
+if (func && !bridge_in_acpi && !get_dev_aml_func(DEVICE(pdev))) {
 continue;
 }
 } else {
@@ -1319,25 +1320,6 @@ static Aml *build_q35_osc_method(bool 
enable_native_pcie_hotplug)
 return method;
 }
 
-static void build_smb0(Aml *table, int devnr, int func)
-{
-Aml *scope = aml_scope("_SB.PCI0");
-Aml *dev = aml_device("SMB0");
-bool ambiguous;
-Object *obj;
-/*
- * temporarily fish out device hosting SMBUS, build_smb0 will be gone once
- * PCI enumeration will be switched to call_dev_aml_func()
- */
-obj = object_resolve_path_type("", TYPE_ICH9_SMB_DEVICE, &ambiguous);
-assert(obj && !ambiguous);
-
-aml_append(dev, aml_name_decl("_ADR", aml_int(devnr << 16 | func)));
-call_dev_aml_func(DEVICE(obj), dev);
-aml_append(scope, dev);
-aml_append(table, scope);
-}
-
 static void build_acpi0017(Aml *table)
 {
 Aml *dev, *scope, *method;
@@ -1440,9 +1422,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
 }
 build_q35_pci0_int(dsdt);
-if (pcms->smbus) {
-build_smb0(dsdt, ICH9_SMB_DEV, ICH9_SMB_FUNC);
-}
 }
 
 if (misc->has_hpet) {
-- 
MST
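
For readers unfamiliar with the _ADR value quoted in the commit message
(0x001F0003): for PCI devices the ACPI _ADR object packs the device (slot)
number into the high word and the function number into the low word, which
matches the "devnr << 16 | func" expression in the removed build_smb0().
A minimal sketch follows; the helper names are hypothetical and are not
QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Build a PCI _ADR value: high word = device (slot), low word = function. */
static uint32_t pci_adr(uint8_t devnr, uint8_t func)
{
    assert(devnr < 32 && func < 8);
    return ((uint32_t)devnr << 16) | func;
}

/* Extract the device (slot) number from a PCI _ADR value. */
static uint8_t pci_adr_dev(uint32_t adr)
{
    return (adr >> 16) & 0x1f;
}

/* Extract the function number from a PCI _ADR value. */
static uint8_t pci_adr_func(uint32_t adr)
{
    return adr & 0x7;
}
```

With these helpers, device 0x1F function 3 yields 0x001F0003, i.e. a device
that does not sit on function 0, which is why the enumeration check had to
be relaxed.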

[PULL v2 20/82] bios-tables-test: add test for number of cores > 255

2022-11-02 Thread Michael S. Tsirkin

From: Julia Suvorova 

The new test is run with a large number of cpus and checks if the
core_count field in smbios_cpu_test (structure type 4) is correct.

Choose q35 as it allows running with -smp > 255.

Signed-off-by: Julia Suvorova 
Message-Id: <20220731162141.178443-5-jus...@redhat.com>
Message-Id: <2022101731.101412-5-jus...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Igor Mammedov 
---
 tests/qtest/bios-tables-test.c | 58 ++
 1 file changed, 45 insertions(+), 13 deletions(-)

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index 0db6630772..ee6b1b483d 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -92,6 +92,8 @@ typedef struct {
 SmbiosEntryPoint smbios_ep_table;
 uint16_t smbios_cpu_max_speed;
 uint16_t smbios_cpu_curr_speed;
+uint8_t smbios_core_count;
+uint16_t smbios_core_count2;
 uint8_t *required_struct_types;
 int required_struct_types_len;
 QTestState *qts;
@@ -631,29 +633,42 @@ static inline bool smbios_single_instance(uint8_t type)
 }
 }
 
-static bool smbios_cpu_test(test_data *data, uint32_t addr)
+static void smbios_cpu_test(test_data *data, uint32_t addr,
+SmbiosEntryPointType ep_type)
 {
-uint16_t expect_speed[2];
-uint16_t real;
+uint8_t core_count, expected_core_count = data->smbios_core_count;
+uint16_t speed, expected_speed[2];
+uint16_t core_count2, expected_core_count2 = data->smbios_core_count2;
 int offset[2];
 int i;
 
 /* Check CPU speed for backward compatibility */
 offset[0] = offsetof(struct smbios_type_4, max_speed);
 offset[1] = offsetof(struct smbios_type_4, current_speed);
-expect_speed[0] = data->smbios_cpu_max_speed ? : 2000;
-expect_speed[1] = data->smbios_cpu_curr_speed ? : 2000;
+expected_speed[0] = data->smbios_cpu_max_speed ? : 2000;
+expected_speed[1] = data->smbios_cpu_curr_speed ? : 2000;
 
 for (i = 0; i < 2; i++) {
-real = qtest_readw(data->qts, addr + offset[i]);
-if (real != expect_speed[i]) {
-fprintf(stderr, "Unexpected SMBIOS CPU speed: real %u expect %u\n",
-real, expect_speed[i]);
-return false;
-}
+speed = qtest_readw(data->qts, addr + offset[i]);
+g_assert_cmpuint(speed, ==, expected_speed[i]);
 }
 
-return true;
+core_count = qtest_readb(data->qts,
+addr + offsetof(struct smbios_type_4, core_count));
+
+if (expected_core_count) {
+g_assert_cmpuint(core_count, ==, expected_core_count);
+}
+
+if (ep_type == SMBIOS_ENTRY_POINT_TYPE_64) {
+core_count2 = qtest_readw(data->qts,
+  addr + offsetof(struct smbios_type_4, core_count2));
+
+/* Core Count has reached its limit, checking Core Count 2 */
+if (expected_core_count == 0xFF && expected_core_count2) {
+g_assert_cmpuint(core_count2, ==, expected_core_count2);
+}
+}
 }
 
 static void test_smbios_structs(test_data *data, SmbiosEntryPointType ep_type)
@@ -686,7 +701,7 @@ static void test_smbios_structs(test_data *data, 
SmbiosEntryPointType ep_type)
 set_bit(type, struct_bitmap);
 
 if (type == 4) {
-g_assert(smbios_cpu_test(data, addr));
+smbios_cpu_test(data, addr, ep_type);
 }
 
 /* seek to end of unformatted string area of this struct ("\0\0") */
@@ -908,6 +923,21 @@ static void test_acpi_q35_tcg(void)
 free_test_data(&data);
 }
 
+static void test_acpi_q35_tcg_core_count2(void)
+{
+test_data data = {
+.machine = MACHINE_Q35,
+.variant = ".core-count2",
+.required_struct_types = base_required_struct_types,
+.required_struct_types_len = ARRAY_SIZE(base_required_struct_types),
+.smbios_core_count = 0xFF,
+.smbios_core_count2 = 275,
+};
+
+test_acpi_one("-machine smbios-entry-point-type=64 -smp 275", &data);
+free_test_data(&data);
+}
+
 static void test_acpi_q35_tcg_bridge(void)
 {
 test_data data;
@@ -1887,6 +1917,8 @@ int main(int argc, char *argv[])
 if (has_kvm) {
 qtest_add_func("acpi/q35/kvm/xapic", test_acpi_q35_kvm_xapic);
 qtest_add_func("acpi/q35/kvm/dmar", test_acpi_q35_kvm_dmar);
+qtest_add_func("acpi/q35/core-count2",
+   test_acpi_q35_tcg_core_count2);
 }
 qtest_add_func("acpi/q35/viot", test_acpi_q35_viot);
 #ifdef CONFIG_POSIX
-- 
MST
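
The values the new test expects (core_count 0xFF and core_count2 275 for
-smp 275) follow from how SMBIOS type 4 splits the core count across an
8-bit and a 16-bit field: the 8-bit field saturates at 0xFF and the real
value is then read from Core Count 2. A minimal sketch of that encoding is
below; the struct and function names are hypothetical, not the ones QEMU
uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two SMBIOS type 4 core count fields. */
struct core_count_fields {
    uint8_t core_count;    /* saturates at 0xFF */
    uint16_t core_count2;  /* carries the full count */
};

/* Encode a core count into the 8-bit and 16-bit fields. */
static struct core_count_fields encode_core_count(unsigned cores)
{
    struct core_count_fields f;

    assert(cores < 0xFFFF);
    f.core_count = cores > 255 ? 0xFF : (uint8_t)cores;
    f.core_count2 = (uint16_t)cores;
    return f;
}

/* Recover the core count the way a consumer of the table would. */
static unsigned decode_core_count(struct core_count_fields f)
{
    return f.core_count == 0xFF ? f.core_count2 : f.core_count;
}
```

This also shows why the test only compares core_count2 when the expected
core_count is 0xFF: below that limit the 8-bit field already holds the
answer.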

[PULL v2 26/82] hw/pci: PCIe Data Object Exchange emulation

2022-11-02 Thread Michael S. Tsirkin

From: Huai-Cheng Kuo 

Emulation of PCIe Data Object Exchange (DOE)
PCIE Base Specification r6.0 6.3 Data Object Exchange

Supports multiple DOE PCIe Extended Capabilities for a single PCIe
device. For each capability, a static array of DOEProtocol should be passed
to pcie_doe_init(). The protocols in that array will be registered under
the DOE capability structure. For each protocol, vendor ID, type, and
corresponding callback function (handle_request()) should be implemented.
This callback function represents how the DOE request for corresponding
protocol will be handled.

pcie_doe_{read/write}_config() must be appended to corresponding PCI
device's config_read/write() handler to enable DOE access. In
pcie_doe_read_config(), false will be returned if pci_config_read()
offset is not within DOE capability range. In pcie_doe_write_config(),
the function will have no effect if the address is not within the related
DOE PCIE extended capability.

Signed-off-by: Huai-Cheng Kuo 
Signed-off-by: Chris Browy 
Signed-off-by: Jonathan Cameron 
Message-Id: <20221014151045.24781-2-jonathan.came...@huawei.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/pci/pci_ids.h   |   3 +
 include/hw/pci/pcie.h  |   1 +
 include/hw/pci/pcie_doe.h  | 123 +
 include/hw/pci/pcie_regs.h |   4 +
 hw/pci/pcie_doe.c  | 367 +
 MAINTAINERS|   7 +
 hw/pci/meson.build |   1 +
 7 files changed, 506 insertions(+)
 create mode 100644 include/hw/pci/pcie_doe.h
 create mode 100644 hw/pci/pcie_doe.c

diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
index d5ddea558b..bc9f834fd1 100644
--- a/include/hw/pci/pci_ids.h
+++ b/include/hw/pci/pci_ids.h
@@ -157,6 +157,9 @@
 
 /* Vendors and devices.  Sort key: vendor first, device next. */
 
+/* Ref: PCIe r6.0 Table 6-32 */
+#define PCI_VENDOR_ID_PCI_SIG0x0001
+
 #define PCI_VENDOR_ID_LSI_LOGIC  0x1000
 #define PCI_DEVICE_ID_LSI_53C810 0x0001
 #define PCI_DEVICE_ID_LSI_53C895A0x0012
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index 798a262a0a..698d3de851 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -26,6 +26,7 @@
 #include "hw/pci/pcie_aer.h"
 #include "hw/pci/pcie_sriov.h"
 #include "hw/hotplug.h"
+#include "hw/pci/pcie_doe.h"
 
 typedef enum {
 /* for attention and power indicator */
diff --git a/include/hw/pci/pcie_doe.h b/include/hw/pci/pcie_doe.h
new file mode 100644
index 0000000000..ba4d8b03bd
--- /dev/null
+++ b/include/hw/pci/pcie_doe.h
@@ -0,0 +1,123 @@
+/*
+ * PCIe Data Object Exchange
+ *
+ * Copyright (C) 2021 Avery Design Systems, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef PCIE_DOE_H
+#define PCIE_DOE_H
+
+#include "qemu/range.h"
+#include "qemu/typedefs.h"
+#include "hw/register.h"
+
+/*
+ * Reference:
+ * PCIe r6.0 - 7.9.24 Data Object Exchange Extended Capability
+ */
+/* Capabilities Register - r6.0 7.9.24.2 */
+#define PCI_EXP_DOE_CAP 0x04
+REG32(PCI_DOE_CAP_REG, 0)
+FIELD(PCI_DOE_CAP_REG, INTR_SUPP, 0, 1)
+FIELD(PCI_DOE_CAP_REG, DOE_INTR_MSG_NUM, 1, 11)
+
+/* Control Register - r6.0 7.9.24.3 */
+#define PCI_EXP_DOE_CTRL0x08
+REG32(PCI_DOE_CAP_CONTROL, 0)
+FIELD(PCI_DOE_CAP_CONTROL, DOE_ABORT, 0, 1)
+FIELD(PCI_DOE_CAP_CONTROL, DOE_INTR_EN, 1, 1)
+FIELD(PCI_DOE_CAP_CONTROL, DOE_GO, 31, 1)
+
+/* Status Register - r6.0 7.9.24.4 */
+#define PCI_EXP_DOE_STATUS  0x0c
+REG32(PCI_DOE_CAP_STATUS, 0)
+FIELD(PCI_DOE_CAP_STATUS, DOE_BUSY, 0, 1)
+FIELD(PCI_DOE_CAP_STATUS, DOE_INTR_STATUS, 1, 1)
+FIELD(PCI_DOE_CAP_STATUS, DOE_ERROR, 2, 1)
+FIELD(PCI_DOE_CAP_STATUS, DATA_OBJ_RDY, 31, 1)
+
+/* Write Data Mailbox Register - r6.0 7.9.24.5 */
+#define PCI_EXP_DOE_WR_DATA_MBOX0x10
+
+/* Read Data Mailbox Register - 7.9.xx.6 */
+#define PCI_EXP_DOE_RD_DATA_MBOX0x14
+
+/* PCI-SIG defined Data Object Types - r6.0 Table 6-32 */
+#define PCI_SIG_DOE_DISCOVERY   0x00
+
+#define PCI_DOE_DW_SIZE_MAX (1 << 18)
+#define PCI_DOE_PROTOCOL_NUM_MAX256
+
+#define DATA_OBJ_BUILD_HEADER1(v, p)(((p) << 16) | (v))
+#define DATA_OBJ_LEN_MASK(len)  ((len) & (PCI_DOE_DW_SIZE_MAX - 1))
+
+typedef struct DOEHeader DOEHeader;
+typedef struct DOEProtocol DOEProtocol;
+typedef struct DOECap DOECap;
+
+struct DOEHeader {
+uint16_t vendor_id;
+uint8_t data_obj_type;
+uint8_t reserved;
+uint32_t length;
+} QEMU_PACKED;
+
+/* Protocol infos and rsp function callback */
+struct DOEProtocol {
+uint16_t vendor_id;
+uint8_t data_obj_type;
+bool (*handle_request)(DOECap *);
+};
+
+struct DOECap {
+/* Owner */
+PCIDevice *pdev;
+
+uint16_t offset;
+
+struct {
+bool intr;
+uint16_t vec;
+} cap;
+
+struct {
+bool abort;
+bool intr;
+

[PULL v2 19/82] tests/acpi: allow changes for core_count2 test

2022-11-02 Thread Michael S. Tsirkin

From: Julia Suvorova 

Signed-off-by: Julia Suvorova 
Message-Id: <20220731162141.178443-4-jus...@redhat.com>
Message-Id: <2022101731.101412-4-jus...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Igor Mammedov 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 3 +++
 tests/data/acpi/q35/APIC.core-count2| 0
 tests/data/acpi/q35/DSDT.core-count2| 0
 tests/data/acpi/q35/FACP.core-count2| 0
 4 files changed, 3 insertions(+)
 create mode 100644 tests/data/acpi/q35/APIC.core-count2
 create mode 100644 tests/data/acpi/q35/DSDT.core-count2
 create mode 100644 tests/data/acpi/q35/FACP.core-count2

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..e81dc67a2e 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,4 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/q35/APIC.core-count2",
+"tests/data/acpi/q35/DSDT.core-count2",
+"tests/data/acpi/q35/FACP.core-count2",
diff --git a/tests/data/acpi/q35/APIC.core-count2 
b/tests/data/acpi/q35/APIC.core-count2
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/tests/data/acpi/q35/DSDT.core-count2 
b/tests/data/acpi/q35/DSDT.core-count2
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/tests/data/acpi/q35/FACP.core-count2 
b/tests/data/acpi/q35/FACP.core-count2
new file mode 100644
index 0000000000..e69de29bb2
-- 
MST

Re: [PATCH] linux-user/hppa: Detect glibc ABORT_INSTRUCTION and EXCP_BREAK handler

2022-11-02 Thread Laurent Vivier


On 27/10/2022 at 08:58, Helge Deller wrote:

The glibc on the hppa platform uses the "iitlbp %r0,(%sr0, %r0)"
assembler instruction as ABORT_INSTRUCTION.
If this (in userspace context) illegal assembler statement is found,
dump the registers and report the failure to userspace the same way as
the Linux kernel on physical hardware.

For other illegal instructions report TARGET_ILL_ILLOPC instead of
TARGET_ILL_ILLOPN as si_code.

Additionally add the missing EXCP_BREAK exception handler which occurs
when the "break x,y" assembler instruction is executed and report
EXCP_ASSIST traps.

Signed-off-by: Helge Deller 

diff --git a/linux-user/hppa/cpu_loop.c b/linux-user/hppa/cpu_loop.c
index 98c51e9b8b..a42c34e549 100644
--- a/linux-user/hppa/cpu_loop.c
+++ b/linux-user/hppa/cpu_loop.c
@@ -196,15 +196,20 @@ void cpu_loop(CPUHPPAState *env)
  force_sig_fault(TARGET_SIGSEGV, TARGET_SEGV_MAPERR, env->iaoq_f);
  break;
  case EXCP_ILL:
-EXCP_DUMP(env, "qemu: got CPU exception 0x%x - aborting\n", 
trapnr);
-force_sig_fault(TARGET_SIGILL, TARGET_ILL_ILLOPN, env->iaoq_f);
+EXCP_DUMP(env, "qemu: EXCP_ILL exception %#x\n", trapnr);
+force_sig_fault(TARGET_SIGILL, TARGET_ILL_ILLOPC, env->iaoq_f);
  break;
  case EXCP_PRIV_OPR:
-EXCP_DUMP(env, "qemu: got CPU exception 0x%x - aborting\n", 
trapnr);
-force_sig_fault(TARGET_SIGILL, TARGET_ILL_PRVOPC, env->iaoq_f);
+/* check for glibc ABORT_INSTRUCTION "iitlbp %r0,(%sr0, %r0)" */
+EXCP_DUMP(env, "qemu: EXCP_PRIV_OPR exception %#x\n", trapnr);
+if (env->cr[CR_IIR] == 0x04000000) {
+   force_sig_fault(TARGET_SIGILL, TARGET_ILL_ILLOPC, 
env->iaoq_f);
+} else {
+   force_sig_fault(TARGET_SIGILL, TARGET_ILL_PRVOPC, 
env->iaoq_f);
+}
  break;
  case EXCP_PRIV_REG:
-EXCP_DUMP(env, "qemu: got CPU exception 0x%x - aborting\n", 
trapnr);
+EXCP_DUMP(env, "qemu: EXCP_PRIV_REG exception %#x\n", trapnr);
  force_sig_fault(TARGET_SIGILL, TARGET_ILL_PRVREG, env->iaoq_f);
  break;
  case EXCP_OVERFLOW:
@@ -216,6 +221,10 @@ void cpu_loop(CPUHPPAState *env)
  case EXCP_ASSIST:
  force_sig_fault(TARGET_SIGFPE, 0, env->iaoq_f);
  break;
+case EXCP_BREAK:
+EXCP_DUMP(env, "qemu: EXCP_BREAK exception %#x\n", trapnr);
+force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->iaoq_f & 
~3);
+break;
  case EXCP_DEBUG:
  force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->iaoq_f);
  break;



Applied to my linux-user-for-7.2 branch.

Thanks,
Laurent

[PULL v2 47/82] virtio-net: support queue_enable

2022-11-02 Thread Michael S. Tsirkin

From: Kangjie Xu 

Support queue_enable in the vhost-kernel scenario. It can be called when
a vq reset operation has been performed and the vq is restarted.

Note that we can only restart the vq when vhost has already
started. When launching a new vhost device, vhost is not
started and no vqs are initialized until VIRTIO_PCI_COMMON_STATUS
is written. Thus, we use vhost_started to differentiate the
two cases: vq reset and device start.

Currently it only supports vhost-kernel.

Signed-off-by: Kangjie Xu 
Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
Message-Id: <20221017092558.111082-14-xuanz...@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/net/virtio-net.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 34fb4b1423..e68daf51bb 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -563,6 +563,26 @@ static void virtio_net_queue_reset(VirtIODevice *vdev, 
uint32_t queue_index)
 flush_or_purge_queued_packets(nc);
 }
 
+static void virtio_net_queue_enable(VirtIODevice *vdev, uint32_t queue_index)
+{
+VirtIONet *n = VIRTIO_NET(vdev);
+NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(queue_index));
+int r;
+
+if (!nc->peer || !vdev->vhost_started) {
+return;
+}
+
+if (get_vhost_net(nc->peer) &&
+nc->peer->info->type == NET_CLIENT_DRIVER_TAP) {
+r = vhost_net_virtqueue_restart(vdev, nc, queue_index);
+if (r < 0) {
+error_report("unable to restart vhost net virtqueue: %d, "
+"when resetting the queue", queue_index);
+}
+}
+}
+
 static void virtio_net_reset(VirtIODevice *vdev)
 {
 VirtIONet *n = VIRTIO_NET(vdev);
@@ -3845,6 +3865,7 @@ static void virtio_net_class_init(ObjectClass *klass, 
void *data)
 vdc->bad_features = virtio_net_bad_features;
 vdc->reset = virtio_net_reset;
 vdc->queue_reset = virtio_net_queue_reset;
+vdc->queue_enable = virtio_net_queue_enable;
 vdc->set_status = virtio_net_set_status;
 vdc->guest_notifier_mask = virtio_net_guest_notifier_mask;
 vdc->guest_notifier_pending = virtio_net_guest_notifier_pending;
-- 
MST

[PULL v2 48/82] vhost: vhost-kernel: enable vq reset feature

2022-11-02 Thread Michael S. Tsirkin

From: Kangjie Xu 

Add virtqueue reset feature for vhost-kernel.

Signed-off-by: Kangjie Xu 
Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
Message-Id: <20221017092558.111082-15-xuanz...@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/net/vhost_net.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index f2ada02781..a6a130e1ae 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -47,6 +47,7 @@ static const int kernel_feature_bits[] = {
 VIRTIO_NET_F_MTU,
 VIRTIO_F_IOMMU_PLATFORM,
 VIRTIO_F_RING_PACKED,
+VIRTIO_F_RING_RESET,
 VIRTIO_NET_F_HASH_REPORT,
 VHOST_INVALID_FEATURE_BIT
 };
-- 
MST

[PULL v2 41/82] vhost: expose vhost_virtqueue_start()

2022-11-02 Thread Michael S. Tsirkin

From: Kangjie Xu 

Expose vhost_virtqueue_start(), we need to use it when restarting a
virtqueue.

Signed-off-by: Kangjie Xu 
Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
Message-Id: <20221017092558.111082-8-xuanz...@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/vhost.h | 3 +++
 hw/virtio/vhost.c | 8 
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index d7eb557885..0054a695dc 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -297,6 +297,9 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
 
 int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write);
 
+int vhost_virtqueue_start(struct vhost_dev *dev, struct VirtIODevice *vdev,
+  struct vhost_virtqueue *vq, unsigned idx);
+
 void vhost_dev_reset_inflight(struct vhost_inflight *inflight);
 void vhost_dev_free_inflight(struct vhost_inflight *inflight);
 void vhost_dev_save_inflight(struct vhost_inflight *inflight, QEMUFile *f);
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 5185c15295..788d0a0679 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1081,10 +1081,10 @@ out:
 return ret;
 }
 
-static int vhost_virtqueue_start(struct vhost_dev *dev,
-struct VirtIODevice *vdev,
-struct vhost_virtqueue *vq,
-unsigned idx)
+int vhost_virtqueue_start(struct vhost_dev *dev,
+  struct VirtIODevice *vdev,
+  struct vhost_virtqueue *vq,
+  unsigned idx)
 {
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
 VirtioBusState *vbus = VIRTIO_BUS(qbus);
-- 
MST

[PULL v2 18/82] bios-tables-test: teach test to use smbios 3.0 tables

2022-11-02 Thread Michael S. Tsirkin

From: Julia Suvorova 

Introduce the 64-bit entry point. Since we no longer have a total
number of structures, stop checking for the new ones at the EOF
structure (type 127).

Signed-off-by: Julia Suvorova 
Reviewed-by: Igor Mammedov 
Message-Id: <20220731162141.178443-3-jus...@redhat.com>
Message-Id: <2022101731.101412-3-jus...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test.c | 100 +
 1 file changed, 76 insertions(+), 24 deletions(-)

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index e6096e7f73..0db6630772 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -88,8 +88,8 @@ typedef struct {
 uint64_t rsdp_addr;
 uint8_t rsdp_table[36 /* ACPI 2.0+ RSDP size */];
 GArray *tables;
-uint32_t smbios_ep_addr;
-struct smbios_21_entry_point smbios_ep_table;
+uint64_t smbios_ep_addr[SMBIOS_ENTRY_POINT_TYPE__MAX];
+SmbiosEntryPoint smbios_ep_table;
 uint16_t smbios_cpu_max_speed;
 uint16_t smbios_cpu_curr_speed;
 uint8_t *required_struct_types;
@@ -533,10 +533,9 @@ static void test_acpi_asl(test_data *data)
 free_test_data(_data);
 }
 
-static bool smbios_ep_table_ok(test_data *data)
+static bool smbios_ep2_table_ok(test_data *data, uint32_t addr)
 {
-struct smbios_21_entry_point *ep_table = &data->smbios_ep_table;
-uint32_t addr = data->smbios_ep_addr;
+struct smbios_21_entry_point *ep_table = &data->smbios_ep_table.ep21;
 
 qtest_memread(data->qts, addr, ep_table, sizeof(*ep_table));
 if (memcmp(ep_table->anchor_string, "_SM_", 4)) {
@@ -559,13 +558,29 @@ static bool smbios_ep_table_ok(test_data *data)
 return true;
 }
 
-static void test_smbios_entry_point(test_data *data)
+static bool smbios_ep3_table_ok(test_data *data, uint64_t addr)
+{
+struct smbios_30_entry_point *ep_table = &data->smbios_ep_table.ep30;
+
+qtest_memread(data->qts, addr, ep_table, sizeof(*ep_table));
+if (memcmp(ep_table->anchor_string, "_SM3_", 5)) {
+return false;
+}
+
+if (acpi_calc_checksum((uint8_t *)ep_table, sizeof *ep_table)) {
+return false;
+}
+
+return true;
+}
+
+static SmbiosEntryPointType test_smbios_entry_point(test_data *data)
 {
 uint32_t off;
 
 /* find smbios entry point structure */
 for (off = 0xf0000; off < 0x100000; off += 0x10) {
-uint8_t sig[] = "_SM_";
+uint8_t sig[] = "_SM_", sig3[] = "_SM3_";
 int i;
 
 for (i = 0; i < sizeof sig - 1; ++i) {
@@ -574,14 +589,30 @@ static void test_smbios_entry_point(test_data *data)
 
 if (!memcmp(sig, "_SM_", sizeof sig)) {
 /* signature match, but is this a valid entry point? */
-data->smbios_ep_addr = off;
-if (smbios_ep_table_ok(data)) {
+if (smbios_ep2_table_ok(data, off)) {
+data->smbios_ep_addr[SMBIOS_ENTRY_POINT_TYPE_32] = off;
+}
+}
+
+for (i = 0; i < sizeof sig3 - 1; ++i) {
+sig3[i] = qtest_readb(data->qts, off + i);
+}
+
+if (!memcmp(sig3, "_SM3_", sizeof sig3)) {
+if (smbios_ep3_table_ok(data, off)) {
+data->smbios_ep_addr[SMBIOS_ENTRY_POINT_TYPE_64] = off;
+/* found 64-bit entry point, no need to look for 32-bit one */
 break;
 }
 }
 }
 
-g_assert_cmphex(off, <, 0x100000);
+/* found at least one entry point */
+g_assert_true(data->smbios_ep_addr[SMBIOS_ENTRY_POINT_TYPE_32] ||
+  data->smbios_ep_addr[SMBIOS_ENTRY_POINT_TYPE_64]);
+
+return data->smbios_ep_addr[SMBIOS_ENTRY_POINT_TYPE_64] ?
+   SMBIOS_ENTRY_POINT_TYPE_64 : SMBIOS_ENTRY_POINT_TYPE_32;
 }
 
 static inline bool smbios_single_instance(uint8_t type)
@@ -625,16 +656,23 @@ static bool smbios_cpu_test(test_data *data, uint32_t 
addr)
 return true;
 }
 
-static void test_smbios_structs(test_data *data)
+static void test_smbios_structs(test_data *data, SmbiosEntryPointType ep_type)
 {
 DECLARE_BITMAP(struct_bitmap, SMBIOS_MAX_TYPE+1) = { 0 };
-struct smbios_21_entry_point *ep_table = &data->smbios_ep_table;
-uint32_t addr = le32_to_cpu(ep_table->structure_table_address);
-int i, len, max_len = 0;
+
+SmbiosEntryPoint *ep_table = &data->smbios_ep_table;
+int i = 0, len, max_len = 0;
 uint8_t type, prv, crt;
+uint64_t addr;
+
+if (ep_type == SMBIOS_ENTRY_POINT_TYPE_32) {
+addr = le32_to_cpu(ep_table->ep21.structure_table_address);
+} else {
+addr = le64_to_cpu(ep_table->ep30.structure_table_address);
+}
 
 /* walk the smbios tables */
-for (i = 0; i < le16_to_cpu(ep_table->number_of_structures); i++) {
+do {
 
 /* grab type and formatted area length from struct header */
 type = qtest_readb(data->qts, addr);
@@ -660,19 +698,33 @@ static void test_smbios_structs(test_data *data)
 }

[PULL v2 52/82] acpi: pc: vga: use AcpiDevAmlIf interface to build VGA device descriptors

2022-11-02 Thread Michael S. Tsirkin

From: Igor Mammedov 

Signed-off-by: Igor Mammedov 
Message-Id: <20221017102146.2254096-2-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
NB: we do not expect any functional change in
any ACPI tables with this change. It's only a refactoring.

Reviewed-by: Ani Sinha 
---
 hw/display/vga_int.h   |  2 ++
 hw/display/acpi-vga-stub.c |  7 +++
 hw/display/acpi-vga.c  | 26 ++
 hw/display/vga-pci.c   |  4 
 hw/i386/acpi-build.c   | 26 +-
 hw/display/meson.build | 17 +
 6 files changed, 57 insertions(+), 25 deletions(-)
 create mode 100644 hw/display/acpi-vga-stub.c
 create mode 100644 hw/display/acpi-vga.c

diff --git a/hw/display/vga_int.h b/hw/display/vga_int.h
index 305e700014..330406ad9c 100644
--- a/hw/display/vga_int.h
+++ b/hw/display/vga_int.h
@@ -30,6 +30,7 @@
 #include "ui/console.h"
 
 #include "hw/display/bochs-vbe.h"
+#include "hw/acpi/acpi_aml_interface.h"
 
 #define ST01_V_RETRACE  0x08
 #define ST01_DISP_ENABLE0x01
@@ -195,4 +196,5 @@ void pci_std_vga_mmio_region_init(VGACommonState *s,
   MemoryRegion *subs,
   bool qext, bool edid);
 
+void build_vga_aml(AcpiDevAmlIf *adev, Aml *scope);
 #endif
diff --git a/hw/display/acpi-vga-stub.c b/hw/display/acpi-vga-stub.c
new file mode 100644
index 00..a9b0ecf76d
--- /dev/null
+++ b/hw/display/acpi-vga-stub.c
@@ -0,0 +1,7 @@
+#include "qemu/osdep.h"
+#include "hw/acpi/acpi_aml_interface.h"
+#include "vga_int.h"
+
+void build_vga_aml(AcpiDevAmlIf *adev, Aml *scope)
+{
+}
diff --git a/hw/display/acpi-vga.c b/hw/display/acpi-vga.c
new file mode 100644
index 00..f0e9ef1fcf
--- /dev/null
+++ b/hw/display/acpi-vga.c
@@ -0,0 +1,26 @@
+#include "qemu/osdep.h"
+#include "hw/acpi/acpi_aml_interface.h"
+#include "hw/pci/pci.h"
+#include "vga_int.h"
+
+void build_vga_aml(AcpiDevAmlIf *adev, Aml *scope)
+{
+int s3d = 0;
+Aml *method;
+
+if (object_dynamic_cast(OBJECT(adev), "qxl-vga")) {
+s3d = 3;
+}
+
+method = aml_method("_S1D", 0, AML_NOTSERIALIZED);
+aml_append(method, aml_return(aml_int(0)));
+aml_append(scope, method);
+
+method = aml_method("_S2D", 0, AML_NOTSERIALIZED);
+aml_append(method, aml_return(aml_int(0)));
+aml_append(scope, method);
+
+method = aml_method("_S3D", 0, AML_NOTSERIALIZED);
+aml_append(method, aml_return(aml_int(s3d)));
+aml_append(scope, method);
+}
diff --git a/hw/display/vga-pci.c b/hw/display/vga-pci.c
index 3e5bc259f7..9a91de7ed1 100644
--- a/hw/display/vga-pci.c
+++ b/hw/display/vga-pci.c
@@ -35,6 +35,7 @@
 #include "hw/loader.h"
 #include "hw/display/edid.h"
 #include "qom/object.h"
+#include "hw/acpi/acpi_aml_interface.h"
 
 enum vga_pci_flags {
 PCI_VGA_FLAG_ENABLE_MMIO = 1,
@@ -354,11 +355,13 @@ static void vga_pci_class_init(ObjectClass *klass, void 
*data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+AcpiDevAmlIfClass *adevc = ACPI_DEV_AML_IF_CLASS(klass);
 
 k->vendor_id = PCI_VENDOR_ID_QEMU;
 k->device_id = PCI_DEVICE_ID_QEMU_VGA;
 dc->vmsd = &vmstate_vga_pci;
 set_bit(DEVICE_CATEGORY_DISPLAY, dc->categories);
+adevc->build_dev_aml = build_vga_aml;
 }
 
 static const TypeInfo vga_pci_type_info = {
@@ -369,6 +372,7 @@ static const TypeInfo vga_pci_type_info = {
 .class_init = vga_pci_class_init,
 .interfaces = (InterfaceInfo[]) {
 { INTERFACE_CONVENTIONAL_PCI_DEVICE },
+{ TYPE_ACPI_DEV_AML_IF },
 { },
 },
 };
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 4f54b61904..26932b4e2c 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -430,7 +430,6 @@ static void build_append_pci_bus_devices(Aml *parent_scope, 
PCIBus *bus,
 bool hotpluggbale_slot = false;
 bool bridge_in_acpi = false;
 bool cold_plugged_bridge = false;
-bool is_vga = false;
 
 if (pdev) {
 pc = PCI_DEVICE_GET_CLASS(pdev);
@@ -440,8 +439,6 @@ static void build_append_pci_bus_devices(Aml *parent_scope, 
PCIBus *bus,
 continue;
 }
 
-is_vga = pc->class_id == PCI_CLASS_DISPLAY_VGA;
-
 /*
  * Cold plugged bridges aren't themselves hot-pluggable.
  * Hotplugged bridges *are* hot-pluggable.
@@ -489,28 +486,7 @@ static void build_append_pci_bus_devices(Aml 
*parent_scope, PCIBus *bus,
 aml_append(dev, aml_pci_device_dsm());
 }
 
-if (is_vga) {
-/* add VGA specific AML methods */
-int s3d;
-
-if (object_dynamic_cast(OBJECT(pdev), "qxl-vga")) {
-s3d = 3;
-} else {
-s3d = 0;
-}
-
-method = aml_method("_S1D", 0, AML_NOTSERIALIZED);
-aml_append(method, aml_return(aml_int(0)));
-

[PULL v2 78/82] vfio: move implement of vfio_get_xlat_addr() to memory.c

2022-11-02 Thread Michael S. Tsirkin

From: Cindy Lu 

- Move the implementation of vfio_get_xlat_addr() to softmmu/memory.c and
  rename it to memory_get_xlat_addr(), so that other devices, such as
  vDPA devices, can use it.
- Add a new vfio_get_xlat_addr() in hw/vfio/common.c that checks whether
  the memory is backed by a discard manager, so that each device can
  issue its own warning.

Signed-off-by: Cindy Lu 
Message-Id: <20221031031020.1405111-2-l...@redhat.com>
Acked-by: Alex Williamson 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/exec/memory.h |  4 +++
 hw/vfio/common.c  | 66 +++
 softmmu/memory.c  | 72 +++
 3 files changed, 81 insertions(+), 61 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index bfb1de8eea..d1e79c39dc 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -713,6 +713,10 @@ void 
ram_discard_manager_register_listener(RamDiscardManager *rdm,
 void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
  RamDiscardListener *rdl);
 
+bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
+  ram_addr_t *ram_addr, bool *read_only,
+  bool *mr_has_discard_manager);
+
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
 typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
 
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6b5d8c0bf6..130e5d1dc7 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -578,45 +578,11 @@ static bool 
vfio_listener_skipped_section(MemoryRegionSection *section)
 static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
ram_addr_t *ram_addr, bool *read_only)
 {
-MemoryRegion *mr;
-hwaddr xlat;
-hwaddr len = iotlb->addr_mask + 1;
-bool writable = iotlb->perm & IOMMU_WO;
-
-/*
- * The IOMMU TLB entry we have just covers translation through
- * this IOMMU to its immediate target.  We need to translate
- * it the rest of the way through to memory.
- */
-mr = address_space_translate(&address_space_memory,
- iotlb->translated_addr,
- &xlat, &len, writable,
- MEMTXATTRS_UNSPECIFIED);
-if (!memory_region_is_ram(mr)) {
-error_report("iommu map to non memory area %"HWADDR_PRIx"",
- xlat);
-return false;
-} else if (memory_region_has_ram_discard_manager(mr)) {
-RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
-MemoryRegionSection tmp = {
-.mr = mr,
-.offset_within_region = xlat,
-.size = int128_make64(len),
-};
-
-/*
- * Malicious VMs can map memory into the IOMMU, which is expected
- * to remain discarded. vfio will pin all pages, populating memory.
- * Disallow that. vmstate priorities make sure any RamDiscardManager
- * were already restored before IOMMUs are restored.
- */
-if (!ram_discard_manager_is_populated(rdm, &tmp)) {
-error_report("iommu map to discarded memory (e.g., unplugged via"
- " virtio-mem): %"HWADDR_PRIx"",
- iotlb->translated_addr);
-return false;
-}
+bool ret, mr_has_discard_manager;
 
+ret = memory_get_xlat_addr(iotlb, vaddr, ram_addr, read_only,
+   &mr_has_discard_manager);
+if (ret && mr_has_discard_manager) {
 /*
  * Malicious VMs might trigger discarding of IOMMU-mapped memory. The
  * pages will remain pinned inside vfio until unmapped, resulting in a
@@ -635,29 +601,7 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void 
**vaddr,
  " intended via an IOMMU. It's possible to mitigate "
  " by setting/adjusting RLIMIT_MEMLOCK.");
 }
-
-/*
- * Translation truncates length to the IOMMU page size,
- * check that it did not truncate too much.
- */
-if (len & iotlb->addr_mask) {
-error_report("iommu has granularity incompatible with target AS");
-return false;
-}
-
-if (vaddr) {
-*vaddr = memory_region_get_ram_ptr(mr) + xlat;
-}
-
-if (ram_addr) {
-*ram_addr = memory_region_get_ram_addr(mr) + xlat;
-}
-
-if (read_only) {
-*read_only = !writable || mr->readonly;
-}
-
-return true;
+return ret;
 }
 
 static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 7ba2048836..bc0be3f62c 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -33,6 +33,7 @@
 #include "qemu/accel.h"
 #include "hw/boards.h"
 #include "migration/vmstate.h"
+#include "exec/address-spaces.h"
 
 //#define DEBUG_UNASSIGNED
 
@@ -2121,6

[PULL v2 44/82] vhost-net: vhost-kernel: introduce vhost_net_virtqueue_restart()

2022-11-02 Thread Michael S. Tsirkin

From: Kangjie Xu 

Introduce vhost_net_virtqueue_restart(), which restarts a specific
virtqueue once vhost-net has already been started.
If restarting the virtqueue fails, the device is stopped.

Here we do not reuse vhost_net_start_one() or vhost_dev_start()
because they work at queue pair level. The mem table and features
do not change, so we can call the vhost_virtqueue_start() to
restart a specific queue.

This patch only considers the case of vhost-kernel, when
NetClientDriver is NET_CLIENT_DRIVER_TAP.
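
The control flow described above can be sketched with mock types. Everything named mock_* below is hypothetical and stands in for the real vhost structures and calls; only the shape of the logic (refuse if the device is not started, restart one queue, stop the whole device on failure) follows the description:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MOCK_NUM_QUEUES 4

/* Hypothetical stand-in for the device state, not a QEMU type. */
struct mock_dev {
    bool started;
    bool fail_start;                       /* force the next start to fail */
    bool queue_running[MOCK_NUM_QUEUES];
};

static int mock_queue_start(struct mock_dev *d, int idx)
{
    if (d->fail_start) {
        return -EIO;
    }
    d->queue_running[idx] = true;
    return 0;
}

static void mock_dev_stop(struct mock_dev *d)
{
    for (int i = 0; i < MOCK_NUM_QUEUES; i++) {
        d->queue_running[i] = false;
    }
    d->started = false;
}

/* Restart a single queue; on failure the whole device is stopped. */
static int mock_queue_restart(struct mock_dev *d, int idx)
{
    int r;

    assert(idx >= 0 && idx < MOCK_NUM_QUEUES);

    if (!d->started) {
        return -EBUSY;
    }

    r = mock_queue_start(d, idx);
    if (r < 0) {
        mock_dev_stop(d);
        return r;
    }
    return 0;
}
```

The sketch leaves out the backend rebinding that the TAP case needs; it only shows why a single-queue start is enough when the memory table and features are unchanged.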

Signed-off-by: Kangjie Xu 
Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
Message-Id: <20221017092558.111082-11-xuanz...@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/net/vhost_net.h |  2 ++
 hw/net/vhost_net-stub.c |  6 +
 hw/net/vhost_net.c  | 53 +
 3 files changed, 61 insertions(+)

diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
index 85d85a4957..40b9a40074 100644
--- a/include/net/vhost_net.h
+++ b/include/net/vhost_net.h
@@ -50,4 +50,6 @@ int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu);
 
 void vhost_net_virtqueue_reset(VirtIODevice *vdev, NetClientState *nc,
int vq_index);
+int vhost_net_virtqueue_restart(VirtIODevice *vdev, NetClientState *nc,
+int vq_index);
 #endif
diff --git a/hw/net/vhost_net-stub.c b/hw/net/vhost_net-stub.c
index 2d745e359c..9f7daae99c 100644
--- a/hw/net/vhost_net-stub.c
+++ b/hw/net/vhost_net-stub.c
@@ -107,3 +107,9 @@ void vhost_net_virtqueue_reset(VirtIODevice *vdev, 
NetClientState *nc,
 {
 
 }
+
+int vhost_net_virtqueue_restart(VirtIODevice *vdev, NetClientState *nc,
+int vq_index)
+{
+return 0;
+}
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 519dced899..f2ada02781 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -34,6 +34,7 @@
 #include "standard-headers/linux/virtio_ring.h"
 #include "hw/virtio/vhost.h"
 #include "hw/virtio/virtio-bus.h"
+#include "linux-headers/linux/vhost.h"
 
 
 /* Features supported by host kernel. */
@@ -556,3 +557,55 @@ void vhost_net_virtqueue_reset(VirtIODevice *vdev, 
NetClientState *nc,
  net->dev.vqs + idx,
  net->dev.vq_index + idx);
 }
+
+int vhost_net_virtqueue_restart(VirtIODevice *vdev, NetClientState *nc,
+int vq_index)
+{
+VHostNetState *net = get_vhost_net(nc->peer);
+const VhostOps *vhost_ops = net->dev.vhost_ops;
+struct vhost_vring_file file = { };
+int idx, r;
+
+if (!net->dev.started) {
+return -EBUSY;
+}
+
+/* should only be called after backend is connected */
+assert(vhost_ops);
+
+idx = vhost_ops->vhost_get_vq_index(&net->dev, vq_index);
+
+r = vhost_virtqueue_start(&net->dev,
+  vdev,
+  net->dev.vqs + idx,
+  net->dev.vq_index + idx);
+if (r < 0) {
+goto err_start;
+}
+
+if (net->nc->info->type == NET_CLIENT_DRIVER_TAP) {
+file.index = idx;
+file.fd = net->backend;
+r = vhost_net_set_backend(&net->dev, &file);
+if (r < 0) {
+r = -errno;
+goto err_start;
+}
+}
+
+return 0;
+
+err_start:
+error_report("Error when restarting the queue.");
+
+if (net->nc->info->type == NET_CLIENT_DRIVER_TAP) {
+file.fd = VHOST_FILE_UNBIND;
+file.index = idx;
+int r = vhost_net_set_backend(&net->dev, &file);
+assert(r >= 0);
+}
+
+vhost_dev_stop(&net->dev, vdev);
+
+return r;
+}
-- 
MST

[PULL v2 63/82] hw/acpi/erst.c: Fix memory handling issues

2022-11-02 Thread Michael S. Tsirkin

From: "Christian A. Ehrhardt" 

- Fix memset argument order: The second argument is
  the value, the length goes last.
- Fix an integer overflow reported by Alexander Bulekov.

Both issues allow the guest to overrun the host buffer
allocated for the ERST memory device.
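
Both fixes can be illustrated in isolation. The sketch below is not the ERST code: the helper names and the uint32_t types are assumptions chosen for the example, and the extra offset guard in record_fits() is added here so the helper is safe on its own. It contrasts a bounds check whose sum can wrap with one rearranged as a subtraction, and shows the memset() argument order (destination, fill value, byte count):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pre-fix shape: the 32-bit sum can wrap around and pass the check. */
static bool record_fits_wrapping(uint32_t record_offset,
                                 uint32_t record_length,
                                 uint32_t exchange_length)
{
    return !((uint32_t)(record_offset + record_length) > exchange_length);
}

/* Post-fix shape: compare against the remaining space instead. */
static bool record_fits(uint32_t record_offset, uint32_t record_length,
                        uint32_t exchange_length)
{
    if (record_offset > exchange_length) {
        return false;   /* keeps the subtraction below from underflowing */
    }
    return record_length <= exchange_length - record_offset;
}

/* Copy a record into a slot and pad the rest of the slot with 0xFF. */
static void store_record(uint8_t *slot, size_t slot_size,
                         const uint8_t *record, size_t record_length)
{
    assert(record_length <= slot_size);
    memcpy(slot, record, record_length);
    /* memset(dest, value, count): the fill value is the second argument */
    memset(slot + record_length, 0xFF, slot_size - record_length);
}
```

With an offset of 16 and a guest-supplied length of 0xFFFFFFF8, the wrapping form computes 8 and accepts the record, while the subtraction form rejects it.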

Cc: Eric DeVolder 
Cc: qemu-sta...@nongnu.org
Fixes: f7e26ffa590 ("ACPI ERST: support for ACPI ERST feature")
Tested-by: Alexander Bulekov 
Signed-off-by: Christian A. Ehrhardt 
Message-Id: <20221024154233.1043347-1...@c--e.de>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1268
Reviewed-by: Alexander Bulekov 
Reviewed-by: Eric DeVolder 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/acpi/erst.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
index df856b2669..aefcc03ad6 100644
--- a/hw/acpi/erst.c
+++ b/hw/acpi/erst.c
@@ -635,7 +635,7 @@ static unsigned read_erst_record(ERSTDeviceState *s)
 if (record_length < UEFI_CPER_RECORD_MIN_SIZE) {
 rc = STATUS_FAILED;
 }
-if ((s->record_offset + record_length) > exchange_length) {
+if (record_length > exchange_length - s->record_offset) {
 rc = STATUS_FAILED;
 }
 /* If all is ok, copy the record to the exchange buffer */
@@ -684,7 +684,7 @@ static unsigned write_erst_record(ERSTDeviceState *s)
 if (record_length < UEFI_CPER_RECORD_MIN_SIZE) {
 return STATUS_FAILED;
 }
-if ((s->record_offset + record_length) > exchange_length) {
+if (record_length > exchange_length - s->record_offset) {
 return STATUS_FAILED;
 }
 
@@ -716,7 +716,7 @@ static unsigned write_erst_record(ERSTDeviceState *s)
 if (nvram) {
 /* Write the record into the slot */
 memcpy(nvram, exchange, record_length);
-memset(nvram + record_length, exchange_length - record_length, 0xFF);
+memset(nvram + record_length, 0xFF, exchange_length - record_length);
 /* If a new record, increment the record_count */
 if (!record_found) {
 uint32_t record_count;
-- 
MST

Re: [PATCH v5] linux-user: Add close_range() syscall

2022-11-02 Thread Laurent Vivier


On 25/10/2022 at 04:34, Helge Deller wrote:

Signed-off-by: Helge Deller 
---
Changes:
v5: Simplify check of arg2 against target_fd_max even more
v4: Fix check of arg2
v3: fd_trans_unregister() only called if close_range() doesn't fail
v2: consider CLOSE_RANGE_CLOEXEC flag
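
The v2 and v3 notes above boil down to one rule: translator state is dropped only when the host call succeeded and actually closed descriptors. A minimal sketch of that rule follows, using a hypothetical in-memory table in place of the real fd translation layer. FD_TABLE_SIZE, fd_has_translator and after_close_range() are invented for the example. The sketch treats `last` as inclusive, following the close_range(2) description of the range; that bound is a choice made here and is not taken from the hunk below.

```c
#include <assert.h>
#include <stdbool.h>

#define CLOSE_RANGE_CLOEXEC_FLAG (1U << 2)  /* same value as in the patch */
#define FD_TABLE_SIZE 16                    /* hypothetical table size */

/* Hypothetical per-fd bookkeeping, standing in for fd translators. */
static bool fd_has_translator[FD_TABLE_SIZE];

/*
 * Bookkeeping after an emulated close_range(first, last, flags).
 * host_ret is the host syscall result (0 on success).
 * Returns how many table entries were dropped.
 */
static unsigned after_close_range(int host_ret, unsigned first,
                                  unsigned last, unsigned flags)
{
    unsigned fd, end, dropped = 0;

    /* A failed call closed nothing; CLOEXEC only marks descriptors. */
    if (host_ret != 0 || (flags & CLOSE_RANGE_CLOEXEC_FLAG)) {
        return 0;
    }
    if (first >= FD_TABLE_SIZE) {
        return 0;
    }

    /* clamp the upper bound to the table, since last may be UINT_MAX */
    end = last < FD_TABLE_SIZE - 1 ? last : FD_TABLE_SIZE - 1;
    for (fd = first; fd <= end; fd++) {
        if (fd_has_translator[fd]) {
            fd_has_translator[fd] = false;
            dropped++;
        }
    }
    return dropped;
}
```

Clamping before the loop matters because callers commonly pass ~0U as `last` to mean "everything from first upwards".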

diff --git a/linux-user/strace.list b/linux-user/strace.list
index a87415bf3d..78796266e8 100644
--- a/linux-user/strace.list
+++ b/linux-user/strace.list
@@ -103,6 +103,9 @@
  #ifdef TARGET_NR_close
  { TARGET_NR_close, "close" , "%s(%d)", NULL, NULL },
  #endif
+#ifdef TARGET_NR_close_range
+{ TARGET_NR_close_range, "close_range" , "%s(%u,%u,%u)", NULL, NULL },
+#endif
  #ifdef TARGET_NR_connect
  { TARGET_NR_connect, "connect" , "%s(%d,%#x,%d)", NULL, NULL },
  #endif
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 2e954d8dbd..c51d619a5c 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -338,6 +338,13 @@ _syscall3(int,sys_syslog,int,type,char*,bufp,int,len)
  #ifdef __NR_exit_group
  _syscall1(int,exit_group,int,error_code)
  #endif
+#if defined(__NR_close_range) && defined(TARGET_NR_close_range)
+#define __NR_sys_close_range __NR_close_range
+_syscall3(int,sys_close_range,int,first,int,last,int,flags)
+#ifndef CLOSE_RANGE_CLOEXEC
+#define CLOSE_RANGE_CLOEXEC (1U << 2)
+#endif
+#endif
  #if defined(__NR_futex)
  _syscall6(int,sys_futex,int *,uaddr,int,op,int,val,
const struct timespec *,timeout,int *,uaddr2,int,val3)
@@ -8699,6 +8706,18 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int 
num, abi_long arg1,
  case TARGET_NR_close:
  fd_trans_unregister(arg1);
  return get_errno(close(arg1));
+#if defined(__NR_close_range) && defined(TARGET_NR_close_range)
+case TARGET_NR_close_range:
+ret = get_errno(sys_close_range(arg1, arg2, arg3));
+if (ret == 0 && !(arg3 & CLOSE_RANGE_CLOEXEC)) {
+abi_long fd, maxfd;
+maxfd = MIN(arg2, target_fd_max);
+for (fd = arg1; fd < maxfd; fd++) {
+fd_trans_unregister(fd);
+}
+}
+return ret;
+#endif

  case TARGET_NR_brk:
  return do_brk(arg1);



Applied to my linux-user-for-7.2 branch.

Thanks,
Laurent

[PULL v2 45/82] virtio-net: introduce flush_or_purge_queued_packets()

2022-11-02 Thread Michael S. Tsirkin

From: Kangjie Xu 

Introduce the function flush_or_purge_queued_packets(). It will be
used in both device reset and virtqueue reset, so extract the
common logic into a new function.

Signed-off-by: Kangjie Xu 
Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
Message-Id: <20221017092558.111082-12-xuanz...@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/net/virtio-net.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index b6903aea54..038a6fba7c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -124,6 +124,16 @@ static int vq2q(int queue_index)
 return queue_index / 2;
 }
 
+static void flush_or_purge_queued_packets(NetClientState *nc)
+{
+if (!nc->peer) {
+return;
+}
+
+qemu_flush_or_purge_queued_packets(nc->peer, true);
+assert(!virtio_net_get_subqueue(nc)->async_tx.elem);
+}
+
 /* TODO
  * - we could suppress RX interrupt if we were so inclined.
  */
@@ -566,12 +576,7 @@ static void virtio_net_reset(VirtIODevice *vdev)
 
 /* Flush any async TX */
 for (i = 0;  i < n->max_queue_pairs; i++) {
-NetClientState *nc = qemu_get_subqueue(n->nic, i);
-
-if (nc->peer) {
-qemu_flush_or_purge_queued_packets(nc->peer, true);
-assert(!virtio_net_get_subqueue(nc)->async_tx.elem);
-}
+flush_or_purge_queued_packets(qemu_get_subqueue(n->nic, i));
 }
 }
 
-- 
MST

[PULL v2 76/82] tests: acpi: aarch64/virt: add a test for hmat nodes with no initiators

2022-11-02 Thread Michael S. Tsirkin

From: Hesham Almatary 

This patch imitates the "tests: acpi: q35: add test for hmat nodes
without initiators" commit to test numa nodes with different HMAT
attributes, but on AArch64/virt.

Tested with:
qemu-system-aarch64 -accel tcg \
-machine virt,hmat=on,gic-version=3  -cpu cortex-a57 \
-bios qemu-efi-aarch64/QEMU_EFI.fd \
-kernel Image -append "root=/dev/vda2 console=ttyAMA0" \
-drive if=virtio,file=aarch64.qcow2,format=qcow2,id=hd \
-device virtio-rng-pci \
-net user,hostfwd=tcp::10022-:22 -net nic \
-device intel-hda -device hda-duplex -nographic \
-smp 4 \
-m 3G \
-object memory-backend-ram,size=1G,id=ram0 \
-object memory-backend-ram,size=1G,id=ram1 \
-object memory-backend-ram,size=1G,id=ram2 \
-numa node,nodeid=0,memdev=ram0,cpus=0-1 \
-numa node,nodeid=1,memdev=ram1,cpus=2-3 \
-numa node,nodeid=2,memdev=ram2 \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=10 \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=10485760 \
-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=20 \
-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=5242880 \
-numa hmat-lb,initiator=0,target=2,hierarchy=memory,data-type=access-latency,latency=30 \
-numa hmat-lb,initiator=0,target=2,hierarchy=memory,data-type=access-bandwidth,bandwidth=1048576 \
-numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-latency,latency=20 \
-numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=5242880 \
-numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-latency,latency=10 \
-numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=10485760 \
-numa hmat-lb,initiator=1,target=2,hierarchy=memory,data-type=access-latency,latency=30 \
-numa hmat-lb,initiator=1,target=2,hierarchy=memory,data-type=access-bandwidth,bandwidth=1048576
Signed-off-by: Hesham Almatary 
Message-Id: <20221027100037.251-8-hesham.almat...@huawei.com>
Tested-by: Yicong Yang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test.c | 59 ++
 1 file changed, 59 insertions(+)

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index 669432585b..395d441212 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -1543,6 +1543,63 @@ static void test_acpi_piix4_tcg_acpi_hmat(void)
 test_acpi_tcg_acpi_hmat(MACHINE_PC);
 }
 
+static void test_acpi_virt_tcg_acpi_hmat(void)
+{
+test_data data = {
+.machine = "virt",
+.tcg_only = true,
+.uefi_fl1 = "pc-bios/edk2-aarch64-code.fd",
+.uefi_fl2 = "pc-bios/edk2-arm-vars.fd",
+.cd = "tests/data/uefi-boot-images/bios-tables-test.aarch64.iso.qcow2",
+.ram_start = 0x4000ULL,
+.scan_len = 128ULL * 1024 * 1024,
+};
+
+data.variant = ".acpihmatvirt";
+
+test_acpi_one(" -machine hmat=on"
+  " -cpu cortex-a57"
+  " -smp 4,sockets=2"
+  " -m 256M"
+  " -object memory-backend-ram,size=64M,id=ram0"
+  " -object memory-backend-ram,size=64M,id=ram1"
+  " -object memory-backend-ram,size=128M,id=ram2"
+  " -numa node,nodeid=0,memdev=ram0"
+  " -numa node,nodeid=1,memdev=ram1"
+  " -numa node,nodeid=2,memdev=ram2"
+  " -numa cpu,node-id=0,socket-id=0"
+  " -numa cpu,node-id=0,socket-id=0"
+  " -numa cpu,node-id=1,socket-id=1"
+  " -numa cpu,node-id=1,socket-id=1"
+  " -numa hmat-lb,initiator=0,target=0,hierarchy=memory,"
+  "data-type=access-latency,latency=10"
+  " -numa hmat-lb,initiator=0,target=0,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=10485760"
+  " -numa hmat-lb,initiator=0,target=1,hierarchy=memory,"
+  "data-type=access-latency,latency=20"
+  " -numa hmat-lb,initiator=0,target=1,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=5242880"
+  " -numa hmat-lb,initiator=0,target=2,hierarchy=memory,"
+  "data-type=access-latency,latency=30"
+  " -numa hmat-lb,initiator=0,target=2,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=1048576"
+  " -numa hmat-lb,initiator=1,target=0,hierarchy=memory,"
+  "data-type=access-latency,latency=20"
+  " -numa hmat-lb,initiator=1,target=0,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=5242880"
+  " -numa hmat-lb,initiator=1,target=1,hierarchy=memory,"
+  "data-type=access-latency,latency=10"
+

[PULL v2 72/82] tests: acpi: q35: add test for hmat nodes without initiators

2022-11-02 Thread Michael S. Tsirkin

From: Brice Goglin 

expected HMAT:

[000h    4]Signature : "HMAT"[Heterogeneous Memory 
Attributes Table]
[004h 0004   4] Table Length : 0120
[008h 0008   1] Revision : 02
[009h 0009   1] Checksum : 4F
[00Ah 0010   6]   Oem ID : "BOCHS "
[010h 0016   8] Oem Table ID : "BXPC"
[018h 0024   4] Oem Revision : 0001
[01Ch 0028   4]  Asl Compiler ID : "BXPC"
[020h 0032   4]Asl Compiler Revision : 0001

[024h 0036   4] Reserved : 

[028h 0040   2]   Structure Type :  [Memory Proximity Domain 
Attributes]
[02Ah 0042   2] Reserved : 
[02Ch 0044   4]   Length : 0028
[030h 0048   2]Flags (decoded below) : 0001
Processor Proximity Domain Valid : 1
[032h 0050   2]Reserved1 : 
[034h 0052   4] Attached Initiator Proximity Domain : 
[038h 0056   4]  Memory Proximity Domain : 
[03Ch 0060   4]Reserved2 : 
[040h 0064   8]Reserved3 : 
[048h 0072   8]Reserved4 : 

[050h 0080   2]   Structure Type :  [Memory Proximity Domain 
Attributes]
[052h 0082   2] Reserved : 
[054h 0084   4]   Length : 0028
[058h 0088   2]Flags (decoded below) : 0001
Processor Proximity Domain Valid : 1
[05Ah 0090   2]Reserved1 : 
[05Ch 0092   4] Attached Initiator Proximity Domain : 0001
[060h 0096   4]  Memory Proximity Domain : 0001
[064h 0100   4]Reserved2 : 
[068h 0104   8]Reserved3 : 
[070h 0112   8]Reserved4 : 

[078h 0120   2]   Structure Type :  [Memory Proximity Domain 
Attributes]
[07Ah 0122   2] Reserved : 
[07Ch 0124   4]   Length : 0028
[080h 0128   2]Flags (decoded below) : 
Processor Proximity Domain Valid : 0
[082h 0130   2]Reserved1 : 
[084h 0132   4] Attached Initiator Proximity Domain : 0080
[088h 0136   4]  Memory Proximity Domain : 0002
[08Ch 0140   4]Reserved2 : 
[090h 0144   8]Reserved3 : 
[098h 0152   8]Reserved4 : 

[0A0h 0160   2]   Structure Type : 0001 [System Locality Latency 
and Bandwidth Information]
[0A2h 0162   2] Reserved : 
[0A4h 0164   4]   Length : 0040
[0A8h 0168   1]Flags (decoded below) : 00
Memory Hierarchy : 0
[0A9h 0169   1]Data Type : 00
[0AAh 0170   2]Reserved1 : 
[0ACh 0172   4] Initiator Proximity Domains # : 0002
[0B0h 0176   4]   Target Proximity Domains # : 0003
[0B4h 0180   4]Reserved2 : 
[0B8h 0184   8]  Entry Base Unit : 2710
[0C0h 0192   4] Initiator Proximity Domain List : 
[0C4h 0196   4] Initiator Proximity Domain List : 0001
[0C8h 0200   4] Target Proximity Domain List : 
[0CCh 0204   4] Target Proximity Domain List : 0001
[0D0h 0208   4] Target Proximity Domain List : 0002
[0D4h 0212   2]Entry : 0001
[0D6h 0214   2]Entry : 0002
[0D8h 0216   2]Entry : 0003
[0DAh 0218   2]Entry : 0002
[0DCh 0220   2]Entry : 0001
[0DEh 0222   2]Entry : 0003

[0E0h 0224   2]   Structure Type : 0001 [System Locality Latency 
and Bandwidth Information]
[0E2h 0226   2] Reserved : 
[0E4h 0228   4]   Length : 0040
[0E8h 0232   1]Flags (decoded below) : 00
Memory Hierarchy : 0
[0E9h 0233   1]Data Type : 03
[0EAh 0234   2]Reserved1 : 
[0ECh 0236   4] Initiator Proximity Domains # : 0002
[0F0h 0240   4]   Target Proximity Domains # : 0003
[0F4h 0244   4]Reserved2 : 
[0F8h 0248   8]  Entry Base Unit : 0001
[100h 0256   4] Initiator Proximity Domain List : 
[104h 0260   4] Initiator Proximity Domain List : 0001
[108h 0264   4] Target Proximity Domain List : 
[10Ch 0268   4] Target Proximity Domain List : 0001
[110h 0272   4] Target Proximity Domain List : 0002
[114h 0276   2]Entry : 000A
[116h 0278   2]Entry : 0005
[118h 0280   2]Entry : 0001
[11Ah 0282   2]Entry : 0005
[11Ch 0284   2]

[PULL v2 42/82] vhost: expose vhost_virtqueue_stop()

2022-11-02 Thread Michael S. Tsirkin

From: Kangjie Xu 

Expose vhost_virtqueue_stop(); we need it when resetting a
virtqueue.

Signed-off-by: Kangjie Xu 
Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
Message-Id: <20221017092558.111082-9-xuanz...@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/vhost.h | 2 ++
 hw/virtio/vhost.c | 8 
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 0054a695dc..353252ac3e 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -299,6 +299,8 @@ int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t 
iova, int write);
 
 int vhost_virtqueue_start(struct vhost_dev *dev, struct VirtIODevice *vdev,
   struct vhost_virtqueue *vq, unsigned idx);
+void vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
+  struct vhost_virtqueue *vq, unsigned idx);
 
 void vhost_dev_reset_inflight(struct vhost_inflight *inflight);
 void vhost_dev_free_inflight(struct vhost_inflight *inflight);
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 788d0a0679..d1c4c20b8c 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1201,10 +1201,10 @@ fail_alloc_desc:
 return r;
 }
 
-static void vhost_virtqueue_stop(struct vhost_dev *dev,
-struct VirtIODevice *vdev,
-struct vhost_virtqueue *vq,
-unsigned idx)
+void vhost_virtqueue_stop(struct vhost_dev *dev,
+  struct VirtIODevice *vdev,
+  struct vhost_virtqueue *vq,
+  unsigned idx)
 {
 int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
 struct vhost_vring_state state = {
-- 
MST

[PULL v2 60/82] tests: acpi: pc/q35 whitelist DSDT before \_GPE cleanup

2022-11-02 Thread Michael S. Tsirkin

From: Igor Mammedov 

Signed-off-by: Igor Mammedov 
Message-Id: <20221017102146.2254096-10-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 34 +
 1 file changed, 34 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..725a1dc798 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,35 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/DSDT",
+"tests/data/acpi/pc/DSDT.bridge",
+"tests/data/acpi/pc/DSDT.ipmikcs",
+"tests/data/acpi/pc/DSDT.cphp",
+"tests/data/acpi/pc/DSDT.memhp",
+"tests/data/acpi/pc/DSDT.numamem",
+"tests/data/acpi/pc/DSDT.nohpet",
+"tests/data/acpi/pc/DSDT.dimmpxm",
+"tests/data/acpi/pc/DSDT.acpihmat",
+"tests/data/acpi/pc/DSDT.acpierst",
+"tests/data/acpi/pc/DSDT.roothp",
+"tests/data/acpi/pc/DSDT.hpbridge",
+"tests/data/acpi/pc/DSDT.hpbrroot",
+"tests/data/acpi/q35/DSDT",
+"tests/data/acpi/q35/DSDT.tis.tpm2",
+"tests/data/acpi/q35/DSDT.tis.tpm12",
+"tests/data/acpi/q35/DSDT.bridge",
+"tests/data/acpi/q35/DSDT.multi-bridge",
+"tests/data/acpi/q35/DSDT.mmio64",
+"tests/data/acpi/q35/DSDT.ipmibt",
+"tests/data/acpi/q35/DSDT.cphp",
+"tests/data/acpi/q35/DSDT.memhp",
+"tests/data/acpi/q35/DSDT.numamem",
+"tests/data/acpi/q35/DSDT.nohpet",
+"tests/data/acpi/q35/DSDT.dimmpxm",
+"tests/data/acpi/q35/DSDT.acpihmat",
+"tests/data/acpi/q35/DSDT.acpierst",
+"tests/data/acpi/q35/DSDT.applesmc",
+"tests/data/acpi/q35/DSDT.pvpanic-isa",
+"tests/data/acpi/q35/DSDT.ivrs",
+"tests/data/acpi/q35/DSDT.viot",
+"tests/data/acpi/q35/DSDT.cxl",
+"tests/data/acpi/q35/DSDT.ipmismbus",
+"tests/data/acpi/q35/DSDT.xapic",
-- 
MST

[PULL v2 66/82] hw/i386/pc.c: CXL Fixed Memory Window should not reserve e820 in bios

2022-11-02 Thread Michael S. Tsirkin

From: Gregory Price 

Early-boot e820 records will be inserted by the bios/efi/early boot
software and be reported to the kernel via insert_resource.  Later, when
CXL drivers iterate through the regions again, they will insert another
resource and make the RESERVED memory area a child.

This RESERVED memory area causes the memory region to become unusable,
and as a result attempting to create memory regions with

`cxl create-region ...`

will fail due to the RESERVED area intersecting with the CXL window.

During boot the following traceback is observed:

0x81101650 in insert_resource_expand_to_fit ()
0x83d964c5 in e820__reserve_resources_late ()
0x83e03210 in pcibios_resource_survey ()
0x83e04f4a in pcibios_init ()

Which produces a call to reserve the CFMWS area:

(gdb) p *new
$54 = {start = 0x29000, end = 0x2cfff, name = "Reserved",
   flags = 0x200, desc = 0x7, parent = 0x0, sibling = 0x0,
   child = 0x0}

Later the kernel parses ACPI tables and reserves the exact same area as
the CXL Fixed Memory Window:

0x811016a4 in insert_resource_conflict ()
  insert_resource ()
0x81a81389 in cxl_parse_cfmws ()
0x818c4a81 in call_handler ()
  acpi_parse_entries_array ()

(gdb) p/x *new
$59 = {start = 0x29000, end = 0x2cfff, name = "CXL Window 0",
   flags = 0x200, desc = 0x0, parent = 0x0, sibling = 0x0,
   child = 0x0}

This produces the following output in /proc/iomem:

59000-68fff : CXL Window 0
  59000-68fff : Reserved

This reserved area causes `get_free_mem_region()` to fail due to a check
against `__region_intersects()`.  Due to this reserved area, the
intersect check will only ever return REGION_INTERSECTS, which causes
`cxl create-region` to always fail.
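
To illustrate why a reserved child that covers the whole window defeats the search for a free region, here is a minimal sketch in C. It is not the kernel's `__region_intersects()` or `get_free_mem_region()`; `struct range`, `ranges_overlap()`, `candidate_is_free()` and `find_free_slot()` are hypothetical names, and the addresses used with them are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a busy resource; not the kernel's struct resource. */
struct range {
    uint64_t start; /* inclusive */
    uint64_t end;   /* inclusive */
};

/* Two inclusive ranges overlap when each one starts before the other ends. */
bool ranges_overlap(struct range a, struct range b)
{
    return a.start <= b.end && b.start <= a.end;
}

/* True if [start, start + size - 1] is clear of every busy range. */
bool candidate_is_free(uint64_t start, uint64_t size,
                       const struct range *busy, size_t n)
{
    struct range cand = { start, start + size - 1 };

    for (size_t i = 0; i < n; i++) {
        if (ranges_overlap(cand, busy[i])) {
            return false;
        }
    }
    return true;
}

/*
 * Walk the window in align-sized steps and report the first slot of
 * `size` bytes that does not touch a busy range. Overflow near
 * UINT64_MAX is ignored to keep the sketch short.
 */
bool find_free_slot(struct range window, uint64_t size, uint64_t align,
                    const struct range *busy, size_t n, uint64_t *out)
{
    assert(size != 0 && align != 0);

    for (uint64_t addr = window.start;
         addr + size - 1 <= window.end;
         addr += align) {
        if (candidate_is_free(addr, size, busy, n)) {
            *out = addr;
            return true;
        }
    }
    return false;
}
```

With a busy range equal to the window itself, every candidate slot overlaps it and the search can never succeed, which mirrors the failure described above. With no busy ranges the first aligned slot is returned.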

Signed-off-by: Gregory Price 
Message-Id: <20221026205912.8579-1-gregory.pr...@memverge.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jonathan Cameron 
---
 hw/i386/pc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index ef14da5094..546b703cb4 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1061,7 +1061,6 @@ void pc_memory_init(PCMachineState *pcms,
 hwaddr cxl_size = MiB;
 
 cxl_base = pc_get_cxl_range_start(pcms);
-e820_add_entry(cxl_base, cxl_size, E820_RESERVED);
 memory_region_init(mr, OBJECT(machine), "cxl_host_reg", cxl_size);
 memory_region_add_subregion(system_memory, cxl_base, mr);
 cxl_resv_end = cxl_base + cxl_size;
@@ -1077,7 +1076,6 @@ void pc_memory_init(PCMachineState *pcms,
 memory_region_init_io(>mr, OBJECT(machine), _ops, fw,
   "cxl-fixed-memory-region", fw->size);
 memory_region_add_subregion(system_memory, fw->base, >mr);
-e820_add_entry(fw->base, fw->size, E820_RESERVED);
 cxl_fmw_base += fw->size;
 cxl_resv_end = cxl_fmw_base;
 }
-- 
MST

[PULL v2 69/82] hw/i386/acpi-build: Resolve north rather than south bridges

2022-11-02 Thread Michael S. Tsirkin

From: Bernhard Beschow 

The code currently assumes Q35 iff ICH9 and i440fx iff PIIX. Now that more
AML generation has been moved into the south bridges and since the
machines define themselves primarily through their north bridges, let's
switch to resolving the north bridges for AML generation instead. This
also allows for easier experimentation with different south bridges in
the "pc" machine, e.g. with PIIX4 and VT82xx.

Signed-off-by: Bernhard Beschow 
Message-Id: <20221028103419.93398-4-shen...@gmail.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 73d8a59737..d9eaa5fc4d 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -60,6 +60,7 @@
 #include "hw/i386/fw_cfg.h"
 #include "hw/i386/ich9.h"
 #include "hw/pci/pci_bus.h"
+#include "hw/pci-host/i440fx.h"
 #include "hw/pci-host/q35.h"
 #include "hw/i386/x86-iommu.h"
 
@@ -1322,8 +1323,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
AcpiPmInfo *pm, AcpiMiscInfo *misc,
Range *pci_hole, Range *pci_hole64, MachineState *machine)
 {
-Object *piix = object_resolve_type_unambiguous(TYPE_PIIX4_PM);
-Object *lpc = object_resolve_type_unambiguous(TYPE_ICH9_LPC_DEVICE);
+Object *i440fx = 
object_resolve_type_unambiguous(TYPE_I440FX_PCI_HOST_BRIDGE);
+Object *q35 = object_resolve_type_unambiguous(TYPE_Q35_HOST_DEVICE);
 CrsRangeEntry *entry;
 Aml *dsdt, *sb_scope, *scope, *dev, *method, *field, *pkg, *crs;
 CrsRangeSet crs_range_set;
@@ -1344,13 +1345,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 AcpiTable table = { .sig = "DSDT", .rev = 1, .oem_id = x86ms->oem_id,
 .oem_table_id = x86ms->oem_table_id };
 
-assert(!!piix != !!lpc);
+assert(!!i440fx != !!q35);
 
 acpi_table_begin(, table_data);
 dsdt = init_aml_allocator();
 
 build_dbg_aml(dsdt);
-if (piix) {
+if (i440fx) {
 sb_scope = aml_scope("_SB");
 dev = aml_device("PCI0");
 aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
@@ -1363,7 +1364,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
 }
 build_piix4_pci0_int(dsdt);
-} else if (lpc) {
+} else if (q35) {
 sb_scope = aml_scope("_SB");
 dev = aml_device("PCI0");
 aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
-- 
MST

[PULL v2 61/82] acpi: pc/35: sanitize _GPE declaration order

2022-11-02 Thread Michael S. Tsirkin

From: Igor Mammedov 

Move the _GPE block declaration before it gets referenced by other
hotplug handlers. While at it, move the PCI hotplug (_E01) handler
after the PCI tree description to avoid forward references to
not yet declared methods/devices.

PS:
Forward 'usage' is usually fine as long as it's hidden within a
method, however 'iasl' may print warnings. So be nice
to iasl/guest OS and do things in the proper order.

PS2: Follow-up patches will also move some of the hotplug code
from the PCI tree to _E01, and that also requires the PCI Device
nodes to be built first, before Scope can reuse them from the
global context.

Signed-off-by: Igor Mammedov 
Message-Id: <20221017102146.2254096-11-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 47 +++-
 1 file changed, 25 insertions(+), 22 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 916343d8d6..960305462c 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1434,6 +1434,18 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 aml_append(dsdt, sb_scope);
 }
 
+scope =  aml_scope("_GPE");
+{
+aml_append(scope, aml_name_decl("_HID", aml_string("ACPI0006")));
+if (machine->nvdimms_state->is_enabled) {
+method = aml_method("_E04", 0, AML_NOTSERIALIZED);
+aml_append(method, aml_notify(aml_name("\\_SB.NVDR"),
+  aml_int(0x80)));
+aml_append(scope, method);
+}
+}
+aml_append(dsdt, scope);
+
 if (pcmc->legacy_cpu_hotplug) {
 build_legacy_cpu_hotplug_aml(dsdt, machine, pm->cpu_hp_io_base);
 } else {
@@ -1452,28 +1464,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
  pcms->memhp_io_base);
 }
 
-scope =  aml_scope("_GPE");
-{
-aml_append(scope, aml_name_decl("_HID", aml_string("ACPI0006")));
-
-if (pm->pcihp_bridge_en || pm->pcihp_root_en) {
-method = aml_method("_E01", 0, AML_NOTSERIALIZED);
-aml_append(method,
-aml_acquire(aml_name("\\_SB.PCI0.BLCK"), 0x));
-aml_append(method, aml_call0("\\_SB.PCI0.PCNT"));
-aml_append(method, aml_release(aml_name("\\_SB.PCI0.BLCK")));
-aml_append(scope, method);
-}
-
-if (machine->nvdimms_state->is_enabled) {
-method = aml_method("_E04", 0, AML_NOTSERIALIZED);
-aml_append(method, aml_notify(aml_name("\\_SB.NVDR"),
-  aml_int(0x80)));
-aml_append(scope, method);
-}
-}
-aml_append(dsdt, scope);
-
 crs_range_set_init(_range_set);
 bus = PC_MACHINE(machine)->bus;
 if (bus) {
@@ -1752,6 +1742,19 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 }
 aml_append(dsdt, sb_scope);
 
+if (pm->pcihp_bridge_en || pm->pcihp_root_en) {
+scope =  aml_scope("_GPE");
+{
+method = aml_method("_E01", 0, AML_NOTSERIALIZED);
+aml_append(method,
+aml_acquire(aml_name("\\_SB.PCI0.BLCK"), 0x));
+aml_append(method, aml_call0("\\_SB.PCI0.PCNT"));
+aml_append(method, aml_release(aml_name("\\_SB.PCI0.BLCK")));
+aml_append(scope, method);
+}
+aml_append(dsdt, scope);
+}
+
 /* copy AML table into ACPI tables blob and patch header there */
 g_array_append_vals(table_data, dsdt->buf->data, dsdt->buf->len);
 acpi_table_end(linker, );
-- 
MST

[PULL v2 71/82] tests: acpi: add and whitelist *.hmat-noinitiator expected blobs

2022-11-02 Thread Michael S. Tsirkin

From: Brice Goglin 

.. which will be used by follow up hmat-noinitiator test-case.

Signed-off-by: Brice Goglin 
Signed-off-by: Hesham Almatary 
Message-Id: <20221027100037.251-3-hesham.almat...@huawei.com>
Tested-by: Yicong Yang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test-allowed-diff.h   | 4 
 tests/data/acpi/q35/APIC.acpihmat-noinitiator | 0
 tests/data/acpi/q35/DSDT.acpihmat-noinitiator | 0
 tests/data/acpi/q35/HMAT.acpihmat-noinitiator | 0
 tests/data/acpi/q35/SRAT.acpihmat-noinitiator | 0
 5 files changed, 4 insertions(+)
 create mode 100644 tests/data/acpi/q35/APIC.acpihmat-noinitiator
 create mode 100644 tests/data/acpi/q35/DSDT.acpihmat-noinitiator
 create mode 100644 tests/data/acpi/q35/HMAT.acpihmat-noinitiator
 create mode 100644 tests/data/acpi/q35/SRAT.acpihmat-noinitiator

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..245fa66bcc 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,5 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/q35/APIC.acpihmat-noinitiator",
+"tests/data/acpi/q35/DSDT.acpihmat-noinitiator",
+"tests/data/acpi/q35/HMAT.acpihmat-noinitiator",
+"tests/data/acpi/q35/SRAT.acpihmat-noinitiator",
diff --git a/tests/data/acpi/q35/APIC.acpihmat-noinitiator 
b/tests/data/acpi/q35/APIC.acpihmat-noinitiator
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/q35/DSDT.acpihmat-noinitiator 
b/tests/data/acpi/q35/DSDT.acpihmat-noinitiator
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/q35/HMAT.acpihmat-noinitiator 
b/tests/data/acpi/q35/HMAT.acpihmat-noinitiator
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/q35/SRAT.acpihmat-noinitiator 
b/tests/data/acpi/q35/SRAT.acpihmat-noinitiator
new file mode 100644
index 00..e69de29bb2
-- 
MST

Re: [PATCH] linux-user: always translate cmsg when recvmsg

2022-11-02 Thread Laurent Vivier


On 28/10/2022 at 10:12, Icenowy Zheng wrote:

It's possible that a message contains both normal payload and ancillary
data in the same message, and even if no ancillary data is available
this information should be passed to the target, otherwise the target
cmsghdr will be left uninitialized and the target is going to access
uninitialized memory if it expects cmsg.

Always call the function that translates cmsg on recvmsg, because that
function should be empty-cmsg-safe (it creates an empty cmsg in the
target).
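
As a rough illustration of the host-side behaviour this relies on, the sketch below receives a message with a control buffer attached and reports how much ancillary data the kernel actually wrote. The helper name `recv_with_control()` is hypothetical and is not part of linux-user. The point is that `msg_controllen` is an in/out field: on Linux it comes back as 0 when the message carries no ancillary data, and an emulator has to pass that updated value on to the target rather than leave the target's buffer untouched.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Receive one message on fd. Returns the payload length from recvmsg()
 * and stores the number of control bytes the kernel filled in
 * *ctrl_len_out (0 when the message has no ancillary data).
 */
ssize_t recv_with_control(int fd, void *buf, size_t len,
                          void *ctrl, size_t ctrl_len,
                          size_t *ctrl_len_out)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;
    ssize_t n;

    assert(ctrl_len_out != NULL);

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl;
    msg.msg_controllen = ctrl_len;   /* in: capacity of ctrl */

    n = recvmsg(fd, &msg, 0);
    if (n >= 0) {
        *ctrl_len_out = msg.msg_controllen;   /* out: bytes written */
    }
    return n;
}
```

A caller that only looks at the payload and never copies the updated control length back would leave whatever was in the caller-side header before the call, which is the uninitialized-memory hazard the commit message describes.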

Signed-off-by: Icenowy Zheng 
---
  linux-user/syscall.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)



Reviewed-by: Laurent Vivier

[PULL v2 14/82] acpi/tests/avocado/bits: add acpi and smbios avocado tests that uses biosbits

2022-11-02 Thread Michael S. Tsirkin

From: Ani Sinha 

This introduces the QEMU acpi/smbios biosbits avocado test, which is run
from within the python virtual environment. When the bits tests are run, the bits
binaries are downloaded from an external repo/location and the bios bits iso is
regenerated to contain the acpi/smbios bits tests that are maintained as part
of the QEMU source under tests/avocado/acpi-bits/bits-test. When the VM is
spawned with the iso, it runs the tests in batch mode and the results are pushed
out from the VM to the test machine, where they are analyzed by this script and
pass/fail results are reported.

Cc: Daniel P. Berrangé 
Cc: Paolo Bonzini 
Cc: Maydell Peter 
Cc: John Snow 
Cc: Thomas Huth 
Cc: Alex Bennée 
Cc: Igor Mammedov 
Cc: Michael Tsirkin 
Signed-off-by: Ani Sinha 
Message-Id: <20221021095108.104843-6-...@anisinha.ca>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/avocado/acpi-bits.py | 396 +
 1 file changed, 396 insertions(+)
 create mode 100644 tests/avocado/acpi-bits.py

diff --git a/tests/avocado/acpi-bits.py b/tests/avocado/acpi-bits.py
new file mode 100644
index 00..8745a58a76
--- /dev/null
+++ b/tests/avocado/acpi-bits.py
@@ -0,0 +1,396 @@
+#!/usr/bin/env python3
+# group: rw quick
+# Exercize QEMU generated ACPI/SMBIOS tables using biosbits,
+# https://biosbits.org/
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+#
+# Author:
+#  Ani Sinha 
+
+# pylint: disable=invalid-name
+# pylint: disable=consider-using-f-string
+
+"""
+This is QEMU ACPI/SMBIOS avocado tests using biosbits.
+Biosbits is available originally at https://biosbits.org/.
+This test uses a fork of the upstream bits and has numerous fixes
+including an upgraded acpica. The fork is located here:
+https://gitlab.com/qemu-project/biosbits-bits .
+"""
+
+import logging
+import os
+import platform
+import re
+import shutil
+import subprocess
+import tarfile
+import tempfile
+import time
+import zipfile
+from typing import (
+List,
+Optional,
+Sequence,
+)
+from qemu.machine import QEMUMachine
+from avocado import skipIf
+from avocado_qemu import QemuBaseTest
+
+deps = ["xorriso"] # dependent tools needed in the test setup/box.
+supported_platforms = ['x86_64'] # supported test platforms.
+
+
+def which(tool):
+""" looks up the full path for @tool, returns None if not found
+or if @tool does not have executable permissions.
+"""
+paths=os.getenv('PATH')
+for p in paths.split(os.path.pathsep):
+p = os.path.join(p, tool)
+if os.path.exists(p) and os.access(p, os.X_OK):
+return p
+return None
+
+def missing_deps():
+""" returns True if any of the test dependent tools are absent.
+"""
+for dep in deps:
+if which(dep) is None:
+return True
+return False
+
+def supported_platform():
+""" checks if the test is running on a supported platform.
+"""
+return platform.machine() in supported_platforms
+
+class QEMUBitsMachine(QEMUMachine): # pylint: disable=too-few-public-methods
+"""
+A QEMU VM, with isa-debugcon enabled and bits iso passed
+using -cdrom to QEMU commandline.
+
+"""
+def __init__(self,
+ binary: str,
+ args: Sequence[str] = (),
+ wrapper: Sequence[str] = (),
+ name: Optional[str] = None,
+ base_temp_dir: str = "/var/tmp",
+ debugcon_log: str = "debugcon-log.txt",
+ debugcon_addr: str = "0x403",
+ sock_dir: Optional[str] = None,
+ qmp_timer: Optional[float] = None):
+# pylint: disable=too-many-arguments
+
+if name is None:
+name = "qemu-bits-%d" % os.getpid()
+if sock_dir is None:
+sock_dir = base_temp_dir
+super().__init__(binary, args, wrapper=wrapper, name=name,
+ base_temp_dir=base_temp_dir,
+ sock_dir=sock_dir, qmp_timer=qmp_timer)
+self.debugcon_log = debugcon_log
+self.debugcon_addr = debugcon_addr
+self.base_temp_dir = base_temp_dir
+
+@property
+def _base_args(self) -> List[str]:
+args = super()._base_args
+args.extend([
+'-chardev',
+'file,path=%s,id=debugcon' %os.path.join(self.base_temp_dir,
+

[PULL v2 64/82] MAINTAINERS: Add qapi/virtio.json to section "virtio"

2022-11-02 Thread Michael S. Tsirkin

From: Markus Armbruster 

Cc: Michael S. Tsirkin 
Signed-off-by: Markus Armbruster 
Message-Id: <20221020120458.80709-1-arm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Philippe Mathieu-Daudé 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8b7d49b089..28cc70c25f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2017,6 +2017,7 @@ S: Supported
 F: hw/*/virtio*
 F: hw/virtio/Makefile.objs
 F: hw/virtio/trace-events
+F: qapi/virtio.json
 F: net/vhost-user.c
 F: include/hw/virtio/
 
-- 
MST

[PULL v2 54/82] acpi: pc/q35: drop ad-hoc PCI-ISA bridge AML routines and let bus ennumeration generate AML

2022-11-02 Thread Michael S. Tsirkin

From: Igor Mammedov 

PCI-ISA bridges that are built into PIIX/Q35 build their own AML
using the AcpiDevAmlIf interface. Now build_append_pci_bus_devices()
has gained AcpiDevAmlIf interface support to get the AML of devices attached
to PCI slots.
So drop the ad-hoc build_q35_isa_bridge()/build_piix4_isa_bridge()
and let PCI bus enumeration include the PCI-ISA bridge AML
when it's enumerated by build_append_pci_bus_devices().

The AML change is mostly contextual: it moves the whole ISA hierarchy
directly under the PCI host bridge instead of it being described
as a separate \_SB.PCI0.ISA block.

Note:
If the bus/slot that hosts the ISA bridge has BSEL set, it will gain new
ASUN and _DSM entries (i.e. acpi-index support; this should not
cause any functional change and is fine from the PCI Firmware
spec point of view). It would be possible to suppress that
by adding a flag to PCIDevice, but I don't see a reason to do that
yet; I'd rather treat the bridge just like any other PCI device if
possible.

Signed-off-by: Igor Mammedov 
Message-Id: <20221017102146.2254096-4-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 75 
 hw/isa/lpc_ich9.c| 23 ++
 hw/isa/piix3.c   | 17 +-
 3 files changed, 39 insertions(+), 76 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 26932b4e2c..e1483bb11a 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -435,10 +435,6 @@ static void build_append_pci_bus_devices(Aml 
*parent_scope, PCIBus *bus,
 pc = PCI_DEVICE_GET_CLASS(pdev);
 dc = DEVICE_GET_CLASS(pdev);
 
-if (pc->class_id == PCI_CLASS_BRIDGE_ISA) {
-continue;
-}
-
 /*
  * Cold plugged bridges aren't themselves hot-pluggable.
  * Hotplugged bridges *are* hot-pluggable.
@@ -1006,7 +1002,6 @@ static void build_piix4_pci0_int(Aml *table)
 {
 Aml *dev;
 Aml *crs;
-Aml *field;
 Aml *method;
 uint32_t irqs;
 Aml *sb_scope = aml_scope("_SB");
@@ -1015,13 +1010,6 @@ static void build_piix4_pci0_int(Aml *table)
 aml_append(pci0_scope, build_prt(true));
 aml_append(sb_scope, pci0_scope);
 
-field = aml_field("PCI0.ISA.P40C", AML_BYTE_ACC, AML_NOLOCK, AML_PRESERVE);
-aml_append(field, aml_named_field("PRQ0", 8));
-aml_append(field, aml_named_field("PRQ1", 8));
-aml_append(field, aml_named_field("PRQ2", 8));
-aml_append(field, aml_named_field("PRQ3", 8));
-aml_append(sb_scope, field);
-
 aml_append(sb_scope, build_irq_status_method());
 aml_append(sb_scope, build_iqcr_method(true));
 
@@ -1125,7 +1113,6 @@ static Aml *build_q35_routing_table(const char *str)
 
 static void build_q35_pci0_int(Aml *table)
 {
-Aml *field;
 Aml *method;
 Aml *sb_scope = aml_scope("_SB");
 Aml *pci0_scope = aml_scope("PCI0");
@@ -1162,18 +1149,6 @@ static void build_q35_pci0_int(Aml *table)
 aml_append(pci0_scope, method);
 aml_append(sb_scope, pci0_scope);
 
-field = aml_field("PCI0.ISA.PIRQ", AML_BYTE_ACC, AML_NOLOCK, AML_PRESERVE);
-aml_append(field, aml_named_field("PRQA", 8));
-aml_append(field, aml_named_field("PRQB", 8));
-aml_append(field, aml_named_field("PRQC", 8));
-aml_append(field, aml_named_field("PRQD", 8));
-aml_append(field, aml_reserved_field(0x20));
-aml_append(field, aml_named_field("PRQE", 8));
-aml_append(field, aml_named_field("PRQF", 8));
-aml_append(field, aml_named_field("PRQG", 8));
-aml_append(field, aml_named_field("PRQH", 8));
-aml_append(sb_scope, field);
-
 aml_append(sb_scope, build_irq_status_method());
 aml_append(sb_scope, build_iqcr_method(false));
 
@@ -1238,54 +1213,6 @@ static Aml *build_q35_dram_controller(const AcpiMcfgInfo 
*mcfg)
 return dev;
 }
 
-static void build_q35_isa_bridge(Aml *table)
-{
-Aml *dev;
-Aml *scope;
-Object *obj;
-bool ambiguous;
-
-/*
- * temporarily fish out isa bridge, build_q35_isa_bridge() will be dropped
- * once PCI is converted to AcpiDevAmlIf and would be ble to generate
- * AML for bridge itself
- */
-obj = object_resolve_path_type("", TYPE_ICH9_LPC_DEVICE, );
-assert(obj && !ambiguous);
-
-scope =  aml_scope("_SB.PCI0");
-dev = aml_device("ISA");
-aml_append(dev, aml_name_decl("_ADR", aml_int(0x001F)));
-
-call_dev_aml_func(DEVICE(obj), dev);
-aml_append(scope, dev);
-aml_append(table, scope);
-}
-
-static void build_piix4_isa_bridge(Aml *table)
-{
-Aml *dev;
-Aml *scope;
-Object *obj;
-bool ambiguous;
-
-/*
- * temporarily fish out isa bridge, build_piix4_isa_bridge() will be 
dropped
- * once PCI is converted to AcpiDevAmlIf and would be ble to generate
- * AML for bridge itself
- */
-obj = object_resolve_path_type("", TYPE_PIIX3_PCI_DEVICE, );
-assert(obj && !ambiguous);
-
-scope =

[PULL v2 36/82] virtio: introduce virtio_queue_reset()

2022-11-02 Thread Michael S. Tsirkin

From: Xuan Zhuo 

Introduce a new interface function virtio_queue_reset() to implement
reset for vq.

Add a new callback to VirtioDeviceClass for queue reset operation for
each child device.
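
The pattern described here, an optional per-class hook followed by the generic reset, can be sketched in isolation as follows. All `demo_*` names and `DEMO_QUEUE_MAX` are hypothetical and only loosely modelled on VirtioDeviceClass; this is not QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_QUEUE_MAX 4   /* hypothetical; unrelated to VIRTIO_QUEUE_MAX */

struct demo_dev;

/* Per-class table of optional hooks. */
struct demo_dev_class {
    void (*queue_reset)(struct demo_dev *dev, uint32_t queue_index);
};

struct demo_queue {
    uint16_t last_avail_idx;
    uint16_t used_idx;
};

struct demo_dev {
    const struct demo_dev_class *klass;
    struct demo_queue vq[DEMO_QUEUE_MAX];
    unsigned hook_calls;   /* bookkeeping for the example only */
};

/* Generic part: clear the per-queue state shared by every device type. */
void demo_generic_queue_reset(struct demo_dev *dev, uint32_t queue_index)
{
    dev->vq[queue_index].last_avail_idx = 0;
    dev->vq[queue_index].used_idx = 0;
}

/* Public entry point: run the device hook if one is set, then the generic part. */
void demo_queue_reset(struct demo_dev *dev, uint32_t queue_index)
{
    assert(queue_index < DEMO_QUEUE_MAX);

    if (dev->klass->queue_reset) {
        dev->klass->queue_reset(dev, queue_index);
    }
    demo_generic_queue_reset(dev, queue_index);
}

/* Example device-specific hook; a real device would stop its backend here. */
void demo_net_queue_reset(struct demo_dev *dev, uint32_t queue_index)
{
    (void)queue_index;
    dev->hook_calls++;
}
```

Leaving the hook pointer NULL keeps devices that need no extra work on the generic path, so only device types with per-queue backend state have to implement it.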

Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
Message-Id: <20221017092558.111082-3-xuanz...@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/virtio.h |  2 ++
 hw/virtio/virtio.c | 11 +++
 2 files changed, 13 insertions(+)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index ebb58feaac..b4c237201d 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -148,6 +148,7 @@ struct VirtioDeviceClass {
 void (*set_config)(VirtIODevice *vdev, const uint8_t *config);
 void (*reset)(VirtIODevice *vdev);
 void (*set_status)(VirtIODevice *vdev, uint8_t val);
+void (*queue_reset)(VirtIODevice *vdev, uint32_t queue_index);
 /* For transitional devices, this is a bitmap of features
  * that are only exposed on the legacy interface but not
  * the modern one.
@@ -286,6 +287,7 @@ int virtio_queue_set_host_notifier_mr(VirtIODevice *vdev, 
int n,
   MemoryRegion *mr, bool assign);
 int virtio_set_status(VirtIODevice *vdev, uint8_t val);
 void virtio_reset(void *opaque);
+void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index);
 void virtio_update_irq(VirtIODevice *vdev);
 int virtio_set_features(VirtIODevice *vdev, uint64_t val);
 
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 6f42fcadd7..cf5f9ca387 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2484,6 +2484,17 @@ static void __virtio_queue_reset(VirtIODevice *vdev, 
uint32_t i)
 virtio_virtqueue_reset_region_cache(>vq[i]);
 }
 
+void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
+{
+VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
+
+if (k->queue_reset) {
+k->queue_reset(vdev, queue_index);
+}
+
+__virtio_queue_reset(vdev, queue_index);
+}
+
 void virtio_reset(void *opaque)
 {
 VirtIODevice *vdev = opaque;
-- 
MST

[PULL v2 65/82] msix: Assert that specified vector is in range

2022-11-02 Thread Michael S. Tsirkin

From: Akihiko Odaki 

There were several different ways to deal with the situation where the
vector specified for a msix function is out of bounds:
- return early from the function and keep progressing
- propagate the error to the caller
- mark msix unusable
- assert it is in bounds
- just ignore

An out-of-bound vector should not be specified if the device
implementation is correct so let msix functions always assert that the
specified vector is in range.

An exceptional case is virtio-pci, which allows the guest to configure
vectors. For virtio-pci, it is more appropriate to introduce its own
checks because it is sometimes too late to check the vector range in
msix functions.
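
The split between the two cases can be sketched as below. The `demo_*` names and `DEMO_MAX_VECTORS` are hypothetical and do not correspond to QEMU's msix or virtio-pci code. The core helper treats an out-of-range vector as a programming error and asserts, while the guest-configurable path validates the value first and rejects it without reaching the assertion.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_VECTORS 8   /* hypothetical upper bound for the example */

struct demo_msix_dev {
    unsigned nr_entries;                     /* must be <= DEMO_MAX_VECTORS */
    unsigned use_count[DEMO_MAX_VECTORS];
};

/*
 * Core helper: callers are device implementations, so an out-of-range
 * vector can only come from a bug and is caught by the assertion.
 */
void demo_vector_use(struct demo_msix_dev *dev, unsigned vector)
{
    assert(vector < dev->nr_entries);
    dev->use_count[vector]++;
}

/*
 * Guest-facing path: the vector number is untrusted input, so it is
 * range-checked here and rejected before the core helper is called.
 */
bool demo_guest_set_vector(struct demo_msix_dev *dev, unsigned vector)
{
    if (vector >= dev->nr_entries) {
        return false;
    }
    demo_vector_use(dev, vector);
    return true;
}
```

Keeping the check in the guest-facing caller means a misbehaving guest gets a clean rejection, while a buggy device model still fails loudly at the point of the mistake.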

Signed-off-by: Akihiko Odaki 
Message-Id: <20220829083524.143640-1-akihiko.od...@daynix.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Yuval Shaia 
Signed-off-by: Akihiko Odaki
---
 include/hw/pci/msix.h |  4 +--
 hw/net/e1000e.c   | 15 ++---
 hw/net/rocker/rocker.c| 23 ++
 hw/net/vmxnet3.c  | 27 +++-
 hw/nvme/ctrl.c|  5 +--
 hw/pci/msix.c | 24 ++
 hw/rdma/vmw/pvrdma_main.c |  7 +---
 hw/remote/vfio-user-obj.c |  9 +-
 hw/virtio/virtio-pci.c| 67 ---
 9 files changed, 74 insertions(+), 107 deletions(-)

diff --git a/include/hw/pci/msix.h b/include/hw/pci/msix.h
index 4f1cda0ebe..0e6f257e45 100644
--- a/include/hw/pci/msix.h
+++ b/include/hw/pci/msix.h
@@ -33,10 +33,10 @@ bool msix_is_masked(PCIDevice *dev, unsigned vector);
 void msix_set_pending(PCIDevice *dev, unsigned vector);
 void msix_clr_pending(PCIDevice *dev, int vector);
 
-int msix_vector_use(PCIDevice *dev, unsigned vector);
+void msix_vector_use(PCIDevice *dev, unsigned vector);
 void msix_vector_unuse(PCIDevice *dev, unsigned vector);
 void msix_unuse_all_vectors(PCIDevice *dev);
-void msix_set_mask(PCIDevice *dev, int vector, bool mask, Error **errp);
+void msix_set_mask(PCIDevice *dev, int vector, bool mask);
 
 void msix_notify(PCIDevice *dev, unsigned vector);
 
diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index ac96f7665a..7523e9f5d2 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -276,25 +276,18 @@ e1000e_unuse_msix_vectors(E1000EState *s, int num_vectors)
 }
 }
 
-static bool
+static void
 e1000e_use_msix_vectors(E1000EState *s, int num_vectors)
 {
 int i;
 for (i = 0; i < num_vectors; i++) {
-int res = msix_vector_use(PCI_DEVICE(s), i);
-if (res < 0) {
-trace_e1000e_msix_use_vector_fail(i, res);
-e1000e_unuse_msix_vectors(s, i);
-return false;
-}
+msix_vector_use(PCI_DEVICE(s), i);
 }
-return true;
 }
 
 static void
 e1000e_init_msix(E1000EState *s)
 {
-PCIDevice *d = PCI_DEVICE(s);
 int res = msix_init(PCI_DEVICE(s), E1000E_MSIX_VEC_NUM,
 >msix,
 E1000E_MSIX_IDX, E1000E_MSIX_TABLE,
@@ -305,9 +298,7 @@ e1000e_init_msix(E1000EState *s)
 if (res < 0) {
 trace_e1000e_msix_init_fail(res);
 } else {
-if (!e1000e_use_msix_vectors(s, E1000E_MSIX_VEC_NUM)) {
-msix_uninit(d, >msix, >msix);
-}
+e1000e_use_msix_vectors(s, E1000E_MSIX_VEC_NUM);
 }
 }
 
diff --git a/hw/net/rocker/rocker.c b/hw/net/rocker/rocker.c
index d8f3f16fe8..281d43e6cf 100644
--- a/hw/net/rocker/rocker.c
+++ b/hw/net/rocker/rocker.c
@@ -1212,24 +1212,14 @@ static void rocker_msix_vectors_unuse(Rocker *r,
 }
 }
 
-static int rocker_msix_vectors_use(Rocker *r,
-   unsigned int num_vectors)
+static void rocker_msix_vectors_use(Rocker *r, unsigned int num_vectors)
 {
 PCIDevice *dev = PCI_DEVICE(r);
-int err;
 int i;
 
 for (i = 0; i < num_vectors; i++) {
-err = msix_vector_use(dev, i);
-if (err) {
-goto rollback;
-}
+msix_vector_use(dev, i);
 }
-return 0;
-
-rollback:
-rocker_msix_vectors_unuse(r, i);
-return err;
 }
 
 static int rocker_msix_init(Rocker *r, Error **errp)
@@ -1247,16 +1237,9 @@ static int rocker_msix_init(Rocker *r, Error **errp)
 return err;
 }
 
-err = rocker_msix_vectors_use(r, ROCKER_MSIX_VEC_COUNT(r->fp_ports));
-if (err) {
-goto err_msix_vectors_use;
-}
+rocker_msix_vectors_use(r, ROCKER_MSIX_VEC_COUNT(r->fp_ports));
 
 return 0;
-
-err_msix_vectors_use:
-msix_uninit(dev, >msix_bar, >msix_bar);
-return err;
 }
 
 static void rocker_msix_uninit(Rocker *r)
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 0b7acf7f89..d2ab527ef4 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2110,20 +2110,14 @@ vmxnet3_unuse_msix_vectors(VMXNET3State *s, int 
num_vectors)
 }
 }
 
-static bool
+static void
 vmxnet3_use_msix_vectors(VMXNET3State *s, int num_vectors)
 {
 PCIDevice *d =

[PULL v2 35/82] virtio: introduce __virtio_queue_reset()

2022-11-02 Thread Michael S. Tsirkin

From: Xuan Zhuo 

Separate the logic of vq reset. This logic will be called directly
later.

Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
Message-Id: <20221017092558.111082-2-xuanz...@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio.c | 37 +
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 808446b4c9..6f42fcadd7 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2464,6 +2464,26 @@ static enum virtio_device_endian 
virtio_current_cpu_endian(void)
 }
 }
 
+static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
+{
+vdev->vq[i].vring.desc = 0;
+vdev->vq[i].vring.avail = 0;
+vdev->vq[i].vring.used = 0;
+vdev->vq[i].last_avail_idx = 0;
+vdev->vq[i].shadow_avail_idx = 0;
+vdev->vq[i].used_idx = 0;
+vdev->vq[i].last_avail_wrap_counter = true;
+vdev->vq[i].shadow_avail_wrap_counter = true;
+vdev->vq[i].used_wrap_counter = true;
+virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR);
+vdev->vq[i].signalled_used = 0;
+vdev->vq[i].signalled_used_valid = false;
+vdev->vq[i].notification = true;
+vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
+vdev->vq[i].inuse = 0;
+virtio_virtqueue_reset_region_cache(>vq[i]);
+}
+
 void virtio_reset(void *opaque)
 {
 VirtIODevice *vdev = opaque;
@@ -2495,22 +2515,7 @@ void virtio_reset(void *opaque)
 virtio_notify_vector(vdev, vdev->config_vector);
 
 for(i = 0; i < VIRTIO_QUEUE_MAX; i++) {
-vdev->vq[i].vring.desc = 0;
-vdev->vq[i].vring.avail = 0;
-vdev->vq[i].vring.used = 0;
-vdev->vq[i].last_avail_idx = 0;
-vdev->vq[i].shadow_avail_idx = 0;
-vdev->vq[i].used_idx = 0;
-vdev->vq[i].last_avail_wrap_counter = true;
-vdev->vq[i].shadow_avail_wrap_counter = true;
-vdev->vq[i].used_wrap_counter = true;
-virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR);
-vdev->vq[i].signalled_used = 0;
-vdev->vq[i].signalled_used_valid = false;
-vdev->vq[i].notification = true;
-vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
-vdev->vq[i].inuse = 0;
-virtio_virtqueue_reset_region_cache(>vq[i]);
+__virtio_queue_reset(vdev, i);
 }
 }
 
-- 
MST

[PULL v2 77/82] tests: virt: Update expected *.acpihmatvirt tables

2022-11-02 Thread Michael S. Tsirkin

From: Hesham Almatary 

* Expected ACPI Data Table [HMAT]
[000h    4]Signature : "HMAT"[Heterogeneous
Memory Attributes Table]
[004h 0004   4] Table Length : 0120
[008h 0008   1] Revision : 02
[009h 0009   1] Checksum : 4F
[00Ah 0010   6]   Oem ID : "BOCHS "
[010h 0016   8] Oem Table ID : "BXPC"
[018h 0024   4] Oem Revision : 0001
[01Ch 0028   4]  Asl Compiler ID : "BXPC"
[020h 0032   4]Asl Compiler Revision : 0001

[024h 0036   4] Reserved : 

[028h 0040   2]   Structure Type :  [Memory Proximity
Domain Attributes]
[02Ah 0042   2] Reserved : 
[02Ch 0044   4]   Length : 0028
[030h 0048   2]Flags (decoded below) : 0001
Processor Proximity Domain Valid : 1
[032h 0050   2]Reserved1 : 
[034h 0052   4]   Processor Proximity Domain : 
[038h 0056   4]  Memory Proximity Domain : 
[03Ch 0060   4]Reserved2 : 
[040h 0064   8]Reserved3 : 
[048h 0072   8]Reserved4 : 

[050h 0080   2]   Structure Type :  [Memory Proximity
Domain Attributes]
[052h 0082   2] Reserved : 
[054h 0084   4]   Length : 0028
[058h 0088   2]Flags (decoded below) : 0001
Processor Proximity Domain Valid : 1
[05Ah 0090   2]Reserved1 : 
[05Ch 0092   4]   Processor Proximity Domain : 0001
[060h 0096   4]  Memory Proximity Domain : 0001
[064h 0100   4]Reserved2 : 
[068h 0104   8]Reserved3 : 
[070h 0112   8]Reserved4 : 

[078h 0120   2]   Structure Type :  [Memory Proximity
Domain Attributes]
[07Ah 0122   2] Reserved : 
[07Ch 0124   4]   Length : 0028
[080h 0128   2]Flags (decoded below) : 
Processor Proximity Domain Valid : 0
[082h 0130   2]Reserved1 : 
[084h 0132   4]   Processor Proximity Domain : 0080
[088h 0136   4]  Memory Proximity Domain : 0002
[08Ch 0140   4]Reserved2 : 
[090h 0144   8]Reserved3 : 
[098h 0152   8]Reserved4 : 

[0A0h 0160   2]   Structure Type : 0001 [System Locality
Latency and Bandwidth Information]
[0A2h 0162   2] Reserved : 
[0A4h 0164   4]   Length : 0040
[0A8h 0168   1]Flags (decoded below) : 00
Memory Hierarchy : 0
[0A9h 0169   1]Data Type : 00
[0AAh 0170   2]Reserved1 : 
[0ACh 0172   4] Initiator Proximity Domains # : 0002
[0B0h 0176   4]   Target Proximity Domains # : 0003
[0B4h 0180   4]Reserved2 : 
[0B8h 0184   8]  Entry Base Unit : 2710
[0C0h 0192   4] Initiator Proximity Domain List : 
[0C4h 0196   4] Initiator Proximity Domain List : 0001
[0C8h 0200   4] Target Proximity Domain List : 
[0CCh 0204   4] Target Proximity Domain List : 0001
[0D0h 0208   4] Target Proximity Domain List : 0002
[0D4h 0212   2]Entry : 0001
[0D6h 0214   2]Entry : 0002
[0D8h 0216   2]Entry : 0003
[0DAh 0218   2]

[PULL v2 11/82] acpi/tests/avocado/bits: initial commit of test scripts that are run by biosbits

2022-11-02 Thread Michael S. Tsirkin

From: Ani Sinha 

This is the initial commit of the cpuid, acpi and smbios Python test scripts for
biosbits to execute. No changes have been made to them from the original code
written by the biosbits author, Josh Triplett. They must be installed into the
bits ISO file and are then run from within the virtual machine booted from that
biosbits ISO.

The test scripts have a ".py2" extension in order to prevent avocado from
loading them. They are written in Python 2.7 and are run from within biosbits.
There is no need for avocado to try to load them and report Python 3
syntax errors.

The original locations of these tests are:
https://github.com/biosbits/bits/blob/master/python/testacpi.py
https://github.com/biosbits/bits/blob/master/python/smbios.py
https://github.com/biosbits/bits/blob/master/python/testcpuid.py

For QEMU, we maintain a fork of the above repo here with numerous fixes:
https://gitlab.com/qemu-project/biosbits-bits

The acpi test for example is maintained here in the fork:
https://gitlab.com/qemu-project/biosbits-bits/-/raw/master/python/testacpi.py

Cc: Daniel P. Berrangé 
Cc: Paolo Bonzini 
Cc: Maydell Peter 
Cc: John Snow 
Cc: Thomas Huth 
Cc: Alex Bennée 
Cc: Igor Mammedov 
Cc: Michael Tsirkin 
Signed-off-by: Ani Sinha 
Reviewed-by: Alex Bennée 
Message-Id: <20221021095108.104843-2-...@anisinha.ca>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/avocado/acpi-bits/bits-tests/smbios.py2 | 2430 +
 .../avocado/acpi-bits/bits-tests/testacpi.py2 |  283 ++
 .../acpi-bits/bits-tests/testcpuid.py2|   83 +
 3 files changed, 2796 insertions(+)
 create mode 100644 tests/avocado/acpi-bits/bits-tests/smbios.py2
 create mode 100644 tests/avocado/acpi-bits/bits-tests/testacpi.py2
 create mode 100644 tests/avocado/acpi-bits/bits-tests/testcpuid.py2

diff --git a/tests/avocado/acpi-bits/bits-tests/smbios.py2 
b/tests/avocado/acpi-bits/bits-tests/smbios.py2
new file mode 100644
index 00..9667d0542c
--- /dev/null
+++ b/tests/avocado/acpi-bits/bits-tests/smbios.py2
@@ -0,0 +1,2430 @@
+# Copyright (c) 2015, Intel Corporation
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+#
+# * Redistributions of source code must retain the above copyright notice,
+#   this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright 
notice,
+#   this list of conditions and the following disclaimer in the 
documentation
+#   and/or other materials provided with the distribution.
+# * Neither the name of Intel Corporation nor the names of its contributors
+#   may be used to endorse or promote products derived from this software
+#   without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 
AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE 
FOR
+# ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 
DAMAGES
+# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND 
ON
+# ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+"""SMBIOS/DMI module."""
+
+import bits
+import bitfields
+import ctypes
+import redirect
+import struct
+import uuid
+import unpack
+import ttypager
+import sys
+
+class SMBIOS(unpack.Struct):
+def __new__(cls):
+if sys.platform == "BITS-EFI":
+import efi
+sm_ptr = 
efi.system_table.ConfigurationTableDict.get(efi.SMBIOS_TABLE_GUID)
+else:
+address = 0xF
+mem = bits.memory(0xF, 0x1)
+for offset in range(0, len(mem), 16):
+signature = (ctypes.c_char * 4).from_address(address + 
offset).value
+if signature == "_SM_":
+entry_point_length = ctypes.c_ubyte.from_address(address + 
offset + 5).value
+csum = sum(map(ord, mem[offset:offset + 
entry_point_length])) & 0xff
+if csum == 0:
+sm_ptr = address + offset
+break
+else:
+return None
+
+if not sm_ptr:
+return None
+
+sm = super(SMBIOS, cls).__new__(cls)
+sm._header_memory = bits.memory(sm_ptr, 0x1f)
+return sm
+
+def __init__(self):
+super(SMBIOS, self).__init__()
+u = unpack.Unpackable(self._header_memory)
+

[PULL v2 30/82] hw/pci-bridge/cxl-upstream: Add a CDAT table access DOE

2022-11-02 Thread Michael S. Tsirkin

From: Jonathan Cameron 

This Data Object Exchange Mailbox allows software to query the
latency and bandwidth between ports on the switch. For now
only provide information on routes between the upstream port and
each downstream port (not p2p).

Signed-off-by: Jonathan Cameron 

--
Changes since v8: Mostly to match the type 3 equivalent
 - Move enum out of function and give it a more descriptive namespace.
Message-Id: <20221014151045.24781-6-jonathan.came...@huawei.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/cxl/cxl_cdat.h|   1 +
 hw/pci-bridge/cxl_upstream.c | 195 ++-
 2 files changed, 195 insertions(+), 1 deletion(-)

diff --git a/include/hw/cxl/cxl_cdat.h b/include/hw/cxl/cxl_cdat.h
index 52c232e912..e9eda00142 100644
--- a/include/hw/cxl/cxl_cdat.h
+++ b/include/hw/cxl/cxl_cdat.h
@@ -131,6 +131,7 @@ typedef struct CDATSslbisHeader {
 uint64_t entry_base_unit;
 } QEMU_PACKED CDATSslbisHeader;
 
+#define CDAT_PORT_ID_USP 0x100
 /* Switch Scoped Latency and Bandwidth Entry - CDAT Table 10 */
 typedef struct CDATSslbe {
 uint16_t port_x_id;
diff --git a/hw/pci-bridge/cxl_upstream.c b/hw/pci-bridge/cxl_upstream.c
index a83a3e81e4..9b8b57df9d 100644
--- a/hw/pci-bridge/cxl_upstream.c
+++ b/hw/pci-bridge/cxl_upstream.c
@@ -10,11 +10,12 @@
 
 #include "qemu/osdep.h"
 #include "qemu/log.h"
+#include "hw/qdev-properties.h"
 #include "hw/pci/msi.h"
 #include "hw/pci/pcie.h"
 #include "hw/pci/pcie_port.h"
 
-#define CXL_UPSTREAM_PORT_MSI_NR_VECTOR 1
+#define CXL_UPSTREAM_PORT_MSI_NR_VECTOR 2
 
 #define CXL_UPSTREAM_PORT_MSI_OFFSET 0x70
 #define CXL_UPSTREAM_PORT_PCIE_CAP_OFFSET 0x90
@@ -28,6 +29,7 @@ typedef struct CXLUpstreamPort {
 
 /*< public >*/
 CXLComponentState cxl_cstate;
+DOECap doe_cdat;
 } CXLUpstreamPort;
 
 CXLComponentState *cxl_usp_to_cstate(CXLUpstreamPort *usp)
@@ -60,6 +62,9 @@ static void cxl_usp_dvsec_write_config(PCIDevice *dev, 
uint32_t addr,
 static void cxl_usp_write_config(PCIDevice *d, uint32_t address,
  uint32_t val, int len)
 {
+CXLUpstreamPort *usp = CXL_USP(d);
+
+pcie_doe_write_config(&usp->doe_cdat, address, val, len);
 pci_bridge_write_config(d, address, val, len);
 pcie_cap_flr_write_config(d, address, val, len);
 pcie_aer_write_config(d, address, val, len);
@@ -67,6 +72,18 @@ static void cxl_usp_write_config(PCIDevice *d, uint32_t 
address,
 cxl_usp_dvsec_write_config(d, address, val, len);
 }
 
+static uint32_t cxl_usp_read_config(PCIDevice *d, uint32_t address, int len)
+{
+CXLUpstreamPort *usp = CXL_USP(d);
+uint32_t val;
+
+if (pcie_doe_read_config(&usp->doe_cdat, address, len, &val)) {
+return val;
+}
+
+return pci_default_read_config(d, address, len);
+}
+
 static void latch_registers(CXLUpstreamPort *usp)
 {
 uint32_t *reg_state = usp->cxl_cstate.crb.cache_mem_registers;
@@ -119,6 +136,167 @@ static void build_dvsecs(CXLComponentState *cxl)
REG_LOC_DVSEC_REVID, dvsec);
 }
 
+static bool cxl_doe_cdat_rsp(DOECap *doe_cap)
+{
+CDATObject *cdat = &CXL_USP(doe_cap->pdev)->cxl_cstate.cdat;
+uint16_t ent;
+void *base;
+uint32_t len;
+CDATReq *req = pcie_doe_get_write_mbox_ptr(doe_cap);
+CDATRsp rsp;
+
+cxl_doe_cdat_update(&CXL_USP(doe_cap->pdev)->cxl_cstate, &error_fatal);
+assert(cdat->entry_len);
+
+/* Discard if request length mismatched */
+if (pcie_doe_get_obj_len(req) <
+DIV_ROUND_UP(sizeof(CDATReq), sizeof(uint32_t))) {
+return false;
+}
+
+ent = req->entry_handle;
+base = cdat->entry[ent].base;
+len = cdat->entry[ent].length;
+
+rsp = (CDATRsp) {
+.header = {
+.vendor_id = CXL_VENDOR_ID,
+.data_obj_type = CXL_DOE_TABLE_ACCESS,
+.reserved = 0x0,
+.length = DIV_ROUND_UP((sizeof(rsp) + len), sizeof(uint32_t)),
+},
+.rsp_code = CXL_DOE_TAB_RSP,
+.table_type = CXL_DOE_TAB_TYPE_CDAT,
+.entry_handle = (ent < cdat->entry_len - 1) ?
+ent + 1 : CXL_DOE_TAB_ENT_MAX,
+};
+
+memcpy(doe_cap->read_mbox, &rsp, sizeof(rsp));
+memcpy(doe_cap->read_mbox + DIV_ROUND_UP(sizeof(rsp), 
sizeof(uint32_t)),
+   base, len);
+
+doe_cap->read_mbox_len += rsp.header.length;
+
+return true;
+}
+
+static DOEProtocol doe_cdat_prot[] = {
+{ CXL_VENDOR_ID, CXL_DOE_TABLE_ACCESS, cxl_doe_cdat_rsp },
+{ }
+};
+
+enum {
+CXL_USP_CDAT_SSLBIS_LAT,
+CXL_USP_CDAT_SSLBIS_BW,
+CXL_USP_CDAT_NUM_ENTRIES
+};
+
+static int build_cdat_table(CDATSubHeader ***cdat_table, void *priv)
+{
+g_autofree CDATSslbis *sslbis_latency = NULL;
+g_autofree CDATSslbis *sslbis_bandwidth = NULL;
+CXLUpstreamPort *us = CXL_USP(priv);
+PCIBus *bus = &PCI_BRIDGE(us)->sec_bus;
+int devfn, sslbis_size, i;
+int count = 0;
+uint16_t port_ids[256];
+
+for (devfn = 0; devfn <

[PULL v2 75/82] hw/arm/virt: Enable HMAT on arm virt machine

2022-11-02 Thread Michael S. Tsirkin

From: Xiang Chen 

Since the patchset ("Build ACPI Heterogeneous Memory Attribute Table (HMAT)"),
HMAT is supported, but only x86 is enabled. Enable HMAT on arm virt machine.

Signed-off-by: Xiang Chen 
Signed-off-by: Hesham Almatary 
Reviewed-by: Igor Mammedov 
Message-Id: <20221027100037.251-7-hesham.almat...@huawei.com>
Tested-by: Yicong Yang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/arm/virt-acpi-build.c | 7 +++
 hw/arm/Kconfig   | 1 +
 2 files changed, 8 insertions(+)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index da9e41e72b..4156111d49 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -42,6 +42,7 @@
 #include "hw/acpi/memory_hotplug.h"
 #include "hw/acpi/generic_event_device.h"
 #include "hw/acpi/tpm.h"
+#include "hw/acpi/hmat.h"
 #include "hw/pci/pcie_host.h"
 #include "hw/pci/pci.h"
 #include "hw/pci/pci_bus.h"
@@ -987,6 +988,12 @@ void virt_acpi_build(VirtMachineState *vms, 
AcpiBuildTables *tables)
 build_slit(tables_blob, tables->linker, ms, vms->oem_id,
vms->oem_table_id);
 }
+
+if (ms->numa_state->hmat_enabled) {
+acpi_add_table(table_offsets, tables_blob);
+build_hmat(tables_blob, tables->linker, ms->numa_state,
+   vms->oem_id, vms->oem_table_id);
+}
 }
 
 if (ms->nvdimms_state->is_enabled) {
diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 15fa79afd3..17fcde8e1c 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -30,6 +30,7 @@ config ARM_VIRT
 select ACPI_VIOT
 select VIRTIO_MEM_SUPPORTED
 select ACPI_CXL
+select ACPI_HMAT
 
 config CHEETAH
 bool
-- 
MST

[PULL v2 73/82] tests: acpi: q35: update expected blobs *.hmat-noinitiators expected HMAT:

2022-11-02 Thread Michael S. Tsirkin

From: Brice Goglin 

[000h    4]Signature : "HMAT"[Heterogeneous Memory 
Attributes Table]
[004h 0004   4] Table Length : 0120
[008h 0008   1] Revision : 02
[009h 0009   1] Checksum : 4F
[00Ah 0010   6]   Oem ID : "BOCHS "
[010h 0016   8] Oem Table ID : "BXPC"
[018h 0024   4] Oem Revision : 0001
[01Ch 0028   4]  Asl Compiler ID : "BXPC"
[020h 0032   4]Asl Compiler Revision : 0001

[024h 0036   4] Reserved : 

[028h 0040   2]   Structure Type :  [Memory Proximity Domain 
Attributes]
[02Ah 0042   2] Reserved : 
[02Ch 0044   4]   Length : 0028
[030h 0048   2]Flags (decoded below) : 0001
Processor Proximity Domain Valid : 1
[032h 0050   2]Reserved1 : 
[034h 0052   4] Attached Initiator Proximity Domain : 
[038h 0056   4]  Memory Proximity Domain : 
[03Ch 0060   4]Reserved2 : 
[040h 0064   8]Reserved3 : 
[048h 0072   8]Reserved4 : 

[050h 0080   2]   Structure Type :  [Memory Proximity Domain 
Attributes]
[052h 0082   2] Reserved : 
[054h 0084   4]   Length : 0028
[058h 0088   2]Flags (decoded below) : 0001
Processor Proximity Domain Valid : 1
[05Ah 0090   2]Reserved1 : 
[05Ch 0092   4] Attached Initiator Proximity Domain : 0001
[060h 0096   4]  Memory Proximity Domain : 0001
[064h 0100   4]Reserved2 : 
[068h 0104   8]Reserved3 : 
[070h 0112   8]Reserved4 : 

[078h 0120   2]   Structure Type :  [Memory Proximity Domain 
Attributes]
[07Ah 0122   2] Reserved : 
[07Ch 0124   4]   Length : 0028
[080h 0128   2]Flags (decoded below) : 
Processor Proximity Domain Valid : 0
[082h 0130   2]Reserved1 : 
[084h 0132   4] Attached Initiator Proximity Domain : 0080
[088h 0136   4]  Memory Proximity Domain : 0002
[08Ch 0140   4]Reserved2 : 
[090h 0144   8]Reserved3 : 
[098h 0152   8]Reserved4 : 

[0A0h 0160   2]   Structure Type : 0001 [System Locality Latency 
and Bandwidth Information]
[0A2h 0162   2] Reserved : 
[0A4h 0164   4]   Length : 0040
[0A8h 0168   1]Flags (decoded below) : 00
Memory Hierarchy : 0
[0A9h 0169   1]Data Type : 00
[0AAh 0170   2]Reserved1 : 
[0ACh 0172   4] Initiator Proximity Domains # : 0002
[0B0h 0176   4]   Target Proximity Domains # : 0003
[0B4h 0180   4]Reserved2 : 
[0B8h 0184   8]  Entry Base Unit : 2710
[0C0h 0192   4] Initiator Proximity Domain List : 
[0C4h 0196   4] Initiator Proximity Domain List : 0001
[0C8h 0200   4] Target Proximity Domain List : 
[0CCh 0204   4] Target Proximity Domain List : 0001
[0D0h 0208   4] Target Proximity Domain List : 0002
[0D4h 0212   2]Entry : 0001
[0D6h 0214   2]Entry : 0002
[0D8h 0216   2]Entry : 0003
[0DAh 0218   2]Entry : 0002
[0DCh 0220   2]Entry : 0001
[0DEh 0222   2]Entry : 0003

[0E0h 0224   2]   Structure Type : 0001 [System Locality Latency 
and Bandwidth Information]
[0E2h 0226   2] Reserved : 
[0E4h 0228   4]   Length : 0040
[0E8h 0232   1]Flags (decoded below) : 00
Memory Hierarchy : 0
[0E9h 0233   1]Data Type : 03
[0EAh 0234   2]Reserved1 : 
[0ECh 0236   4] Initiator Proximity Domains # : 0002
[0F0h 0240   4]   Target Proximity Domains # : 0003
[0F4h 0244   4]Reserved2 : 
[0F8h 0248   8]  Entry Base Unit : 0001
[100h 0256   4] Initiator Proximity Domain List : 
[104h 0260   4] Initiator Proximity Domain List : 0001
[108h 0264   4] Target Proximity Domain List : 
[10Ch 0268   4] Target Proximity Domain List : 0001
[110h 0272   4] Target Proximity Domain List : 0002
[114h 0276   2]Entry : 000A
[116h 0278   2]Entry : 0005
[118h 0280   2]Entry : 0001
[11Ah 0282   2]Entry : 0005
[11Ch 0284   2]

[PULL v2 81/82] intel-iommu: convert VTD_PE_GET_FPD_ERR() to be a function

2022-11-02 Thread Michael S. Tsirkin

From: Jason Wang 

We used to have a macro for VTD_PE_GET_FPD_ERR() but it has an
internal goto which prevents it from being reused. This patch converts
that macro to a dedicated function and lets the caller decide what
to do (e.g. using goto or not). This makes sure it can be reused by
other functions that require fault reporting.
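
To illustrate the point outside of QEMU, here is a minimal standalone
sketch. All names in it (report_fault, translate, probe, reported_faults)
are hypothetical and only stand in for the real helpers; the suppression
check is simplified to a single flag. It shows that once the goto lives in
the caller, the same reporting helper can serve a caller that jumps to an
error label and one that simply returns.

```c
#include <stdbool.h>

/* Hypothetical counter standing in for real fault reporting. */
static int reported_faults;

/* Report a fault unless it is suppressed. No control flow is baked in. */
static void report_fault(int err, bool suppressed)
{
    (void)err; /* a real implementation would record the error code */
    if (!suppressed) {
        reported_faults++;
    }
}

/* Caller that wants to jump to a shared error label. */
static bool translate(int ret, bool suppressed)
{
    if (ret) {
        report_fault(-ret, suppressed);
        goto error;
    }
    return true;

error:
    return false;
}

/* Caller with no error label at all: it can still reuse the helper. */
static int probe(int ret)
{
    if (ret) {
        report_fault(-ret, false);
        return -1;
    }
    return 0;
}
```

With a macro that ends in "goto error", probe() could not have used the
reporting logic without growing an error label of its own.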

Reviewed-by: Peter Xu 
Signed-off-by: Jason Wang 
Message-Id: <20221028061436.30093-4-jasow...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Yi Liu 
---
 hw/i386/intel_iommu.c | 42 --
 1 file changed, 28 insertions(+), 14 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 9fe5a222eb..9029ee98f4 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -49,17 +49,6 @@
 /* pe operations */
 #define VTD_PE_GET_TYPE(pe) ((pe)->val[0] & VTD_SM_PASID_ENTRY_PGTT)
 #define VTD_PE_GET_LEVEL(pe) (2 + (((pe)->val[0] >> 2) & 
VTD_SM_PASID_ENTRY_AW))
-#define VTD_PE_GET_FPD_ERR(ret_fr, is_fpd_set, s, source_id, addr, is_write) {\
-if (ret_fr) { \
-ret_fr = -ret_fr; \
-if (is_fpd_set && vtd_is_qualified_fault(ret_fr)) {   \
-trace_vtd_fault_disabled();   \
-} else {  \
-vtd_report_dmar_fault(s, source_id, addr, ret_fr, is_write);  \
-} \
-goto error;   \
-} \
-}
 
 /*
  * PCI bus number (or SID) is not reliable since the device is usaully
@@ -1716,6 +1705,19 @@ out:
 trace_vtd_pt_enable_fast_path(source_id, success);
 }
 
+static void vtd_report_fault(IntelIOMMUState *s,
+ int err, bool is_fpd_set,
+ uint16_t source_id,
+ hwaddr addr,
+ bool is_write)
+{
+if (is_fpd_set && vtd_is_qualified_fault(err)) {
+trace_vtd_fault_disabled();
+} else {
+vtd_report_dmar_fault(s, source_id, addr, err, is_write);
+}
+}
+
 /* Map dev to context-entry then do a paging-structures walk to do a iommu
  * translation.
  *
@@ -1776,7 +1778,11 @@ static bool vtd_do_iommu_translate(VTDAddressSpace 
*vtd_as, PCIBus *bus,
 is_fpd_set = ce.lo & VTD_CONTEXT_ENTRY_FPD;
 if (!is_fpd_set && s->root_scalable) {
 ret_fr = vtd_ce_get_pasid_fpd(s, &ce, &is_fpd_set);
-VTD_PE_GET_FPD_ERR(ret_fr, is_fpd_set, s, source_id, addr, 
is_write);
+if (ret_fr) {
+vtd_report_fault(s, -ret_fr, is_fpd_set,
+ source_id, addr, is_write);
+goto error;
+}
 }
 } else {
 ret_fr = vtd_dev_to_context_entry(s, bus_num, devfn, &ce);
@@ -1784,7 +1790,11 @@ static bool vtd_do_iommu_translate(VTDAddressSpace 
*vtd_as, PCIBus *bus,
 if (!ret_fr && !is_fpd_set && s->root_scalable) {
 ret_fr = vtd_ce_get_pasid_fpd(s, &ce, &is_fpd_set);
 }
-VTD_PE_GET_FPD_ERR(ret_fr, is_fpd_set, s, source_id, addr, is_write);
+if (ret_fr) {
+vtd_report_fault(s, -ret_fr, is_fpd_set,
+ source_id, addr, is_write);
+goto error;
+}
 /* Update context-cache */
 trace_vtd_iotlb_cc_update(bus_num, devfn, ce.hi, ce.lo,
   cc_entry->context_cache_gen,
@@ -1820,7 +1830,11 @@ static bool vtd_do_iommu_translate(VTDAddressSpace 
*vtd_as, PCIBus *bus,
 
 ret_fr = vtd_iova_to_slpte(s, &ce, addr, is_write, &slpte, &level,
 &reads, &writes, s->aw_bits);
-VTD_PE_GET_FPD_ERR(ret_fr, is_fpd_set, s, source_id, addr, is_write);
+if (ret_fr) {
+vtd_report_fault(s, -ret_fr, is_fpd_set, source_id,
+ addr, is_write);
+goto error;
+}
 
 page_mask = vtd_slpt_level_page_mask(level);
 access_flags = IOMMU_ACCESS_FLAG(reads, writes);
-- 
MST

[PULL v2 68/82] hw/i386/acpi-build: Resolve redundant attribute

2022-11-02 Thread Michael S. Tsirkin

From: Bernhard Beschow 

The is_piix4 attribute is set once in one location and read once in
another. Doing both in one location allows for removing the attribute
altogether.

Signed-off-by: Bernhard Beschow 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20221026133110.91828-3-shen...@gmail.com>
Message-Id: <20221028103419.93398-3-shen...@gmail.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 20 ++--
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 1ebf14b899..73d8a59737 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -112,7 +112,6 @@ typedef struct AcpiPmInfo {
 } AcpiPmInfo;
 
 typedef struct AcpiMiscInfo {
-bool is_piix4;
 bool has_hpet;
 #ifdef CONFIG_TPM
 TPMVersion tpm_version;
@@ -281,17 +280,6 @@ static void acpi_get_pm_info(MachineState *machine, 
AcpiPmInfo *pm)
 
 static void acpi_get_misc_info(AcpiMiscInfo *info)
 {
-Object *piix = object_resolve_type_unambiguous(TYPE_PIIX4_PM);
-Object *lpc = object_resolve_type_unambiguous(TYPE_ICH9_LPC_DEVICE);
-assert(!!piix != !!lpc);
-
-if (piix) {
-info->is_piix4 = true;
-}
-if (lpc) {
-info->is_piix4 = false;
-}
-
 info->has_hpet = hpet_find();
 #ifdef CONFIG_TPM
 info->tpm_version = tpm_get_version(tpm_find());
@@ -1334,6 +1322,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
AcpiPmInfo *pm, AcpiMiscInfo *misc,
Range *pci_hole, Range *pci_hole64, MachineState *machine)
 {
+Object *piix = object_resolve_type_unambiguous(TYPE_PIIX4_PM);
+Object *lpc = object_resolve_type_unambiguous(TYPE_ICH9_LPC_DEVICE);
 CrsRangeEntry *entry;
 Aml *dsdt, *sb_scope, *scope, *dev, *method, *field, *pkg, *crs;
 CrsRangeSet crs_range_set;
@@ -1354,11 +1344,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 AcpiTable table = { .sig = "DSDT", .rev = 1, .oem_id = x86ms->oem_id,
 .oem_table_id = x86ms->oem_table_id };
 
+assert(!!piix != !!lpc);
+
 acpi_table_begin(&table, table_data);
 dsdt = init_aml_allocator();
 
 build_dbg_aml(dsdt);
-if (misc->is_piix4) {
+if (piix) {
 sb_scope = aml_scope("_SB");
 dev = aml_device("PCI0");
 aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
@@ -1371,7 +1363,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
 }
 build_piix4_pci0_int(dsdt);
-} else {
+} else if (lpc) {
 sb_scope = aml_scope("_SB");
 dev = aml_device("PCI0");
 aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
-- 
MST

[PULL v2 17/82] hw/smbios: add core_count2 to smbios table type 4

2022-11-02 Thread Michael S. Tsirkin

From: Julia Suvorova 

In order to use the increased number of cpus, we need to bring smbios
tables in line with the SMBIOS 3.0 specification. This allows us to
introduce core_count2 which acts as a duplicate of core_count if we have
fewer cores than 256, and contains the actual core number per socket if
we have more.

core_enabled2 and thread_count2 fields work the same way.
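
As a rough standalone illustration of the encoding described above (not
the code from this patch), the sketch below assumes the legacy 8-bit field
saturates at 0xFF once the count no longer fits, while the 16-bit field
always carries the real value. The struct and function names are
hypothetical, and endianness conversion is left out.

```c
#include <stdint.h>

/* Hypothetical pair of fields mirroring core_count / core_count2. */
struct type4_counts {
    uint8_t  core_count;   /* legacy 8-bit field */
    uint16_t core_count2;  /* 16-bit field from SMBIOS 3.0 */
};

/* Encode a per-socket core count into both fields. */
static struct type4_counts encode_core_count(unsigned cores)
{
    struct type4_counts c;

    /* Assumption: 0xFF in the legacy field means "see core_count2". */
    c.core_count = cores > 255 ? 0xFF : (uint8_t)cores;
    c.core_count2 = (uint16_t)cores;
    return c;
}
```

For small counts both fields hold the same number; for larger ones a
consumer is expected to read the 16-bit field instead.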

Signed-off-by: Julia Suvorova 
Reviewed-by: Igor Mammedov 
Message-Id: <20220731162141.178443-2-jus...@redhat.com>
Message-Id: <2022101731.101412-2-jus...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/smbios/smbios_build.h |  9 +++--
 include/hw/firmware/smbios.h | 12 
 hw/smbios/smbios.c   | 19 ---
 3 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/hw/smbios/smbios_build.h b/hw/smbios/smbios_build.h
index 56b5a1e3f3..351660024e 100644
--- a/hw/smbios/smbios_build.h
+++ b/hw/smbios/smbios_build.h
@@ -27,6 +27,11 @@ extern unsigned smbios_table_max;
 extern unsigned smbios_table_cnt;
 
 #define SMBIOS_BUILD_TABLE_PRE(tbl_type, tbl_handle, tbl_required)\
+SMBIOS_BUILD_TABLE_PRE_SIZE(tbl_type, tbl_handle, tbl_required,   \
+sizeof(struct smbios_type_##tbl_type))\
+
+#define SMBIOS_BUILD_TABLE_PRE_SIZE(tbl_type, tbl_handle, \
+tbl_required, tbl_len)\
 struct smbios_type_##tbl_type *t; \
 size_t t_off; /* table offset into smbios_tables */   \
 int str_index = 0;\
@@ -39,12 +44,12 @@ extern unsigned smbios_table_cnt;
 /* use offset of table t within smbios_tables */  \
 /* (pointer must be updated after each realloc) */\
 t_off = smbios_tables_len;\
-smbios_tables_len += sizeof(*t);  \
+smbios_tables_len += tbl_len; \
 smbios_tables = g_realloc(smbios_tables, smbios_tables_len);  \
 t = (struct smbios_type_##tbl_type *)(smbios_tables + t_off); \
   \
 t->header.type = tbl_type;\
-t->header.length = sizeof(*t);\
+t->header.length = tbl_len;   \
 t->header.handle = cpu_to_le16(tbl_handle);   \
 } while (0)
 
diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h
index e7d386f7c8..7f3259a630 100644
--- a/include/hw/firmware/smbios.h
+++ b/include/hw/firmware/smbios.h
@@ -18,6 +18,8 @@
 
 
 #define SMBIOS_MAX_TYPE 127
+#define offsetofend(TYPE, MEMBER) \
+   (offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
 
 /* memory area description, used by type 19 table */
 struct smbios_phys_mem_area {
@@ -187,8 +189,18 @@ struct smbios_type_4 {
 uint8_t thread_count;
 uint16_t processor_characteristics;
 uint16_t processor_family2;
+/* SMBIOS spec 3.0.0, Table 21 */
+uint16_t core_count2;
+uint16_t core_enabled2;
+uint16_t thread_count2;
 } QEMU_PACKED;
 
+typedef enum smbios_type_4_len_ver {
+SMBIOS_TYPE_4_LEN_V28 = offsetofend(struct smbios_type_4,
+processor_family2),
+SMBIOS_TYPE_4_LEN_V30 = offsetofend(struct smbios_type_4, thread_count2),
+} smbios_type_4_len_ver;
+
 /* SMBIOS type 8 - Port Connector Information */
 struct smbios_type_8 {
 struct smbios_structure_header header;
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index 51437ca09f..b4243de735 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -711,8 +711,14 @@ static void smbios_build_type_3_table(void)
 static void smbios_build_type_4_table(MachineState *ms, unsigned instance)
 {
 char sock_str[128];
+size_t tbl_len = SMBIOS_TYPE_4_LEN_V28;
 
-SMBIOS_BUILD_TABLE_PRE(4, T4_BASE + instance, true); /* required */
+if (smbios_ep_type == SMBIOS_ENTRY_POINT_TYPE_64) {
+tbl_len = SMBIOS_TYPE_4_LEN_V30;
+}
+
+SMBIOS_BUILD_TABLE_PRE_SIZE(4, T4_BASE + instance,
+true, tbl_len); /* required */
 
 snprintf(sock_str, sizeof(sock_str), "%s%2x", type4.sock_pfx, instance);
 SMBIOS_TABLE_SET_STR(4, socket_designation_str, sock_str);
@@ -739,8 +745,15 @@ static void smbios_build_type_4_table(MachineState *ms, 
unsigned instance)
 SMBIOS_TABLE_SET_STR(4, serial_number_str, type4.serial);
 SMBIOS_TABLE_SET_STR(4, asset_tag_number_str, type4.asset);
 SMBIOS_TABLE_SET_STR(4, part_number_str, type4.part);
-t->core_count = t->core_enabled = ms->smp.cores;
-t->thread_count = ms->smp.threads;
+
+t->core_count = (ms->smp.cores >

[PULL v2 62/82] tests: acpi: update expected blobs

2022-11-02 Thread Michael S. Tsirkin

From: Igor Mammedov 

Expected changes are:
 1) Moving _GPE scope declaration ahead of all _E0x methods
   +Scope (_GPE)
   +{
   +Name (_HID, "ACPI0006" /* GPE Block Device */)  // _HID: Hardware ID
   +}
   +
Scope (_SB)
{
Device (\_SB.PCI0.PRES)

\_SB.CPUS.CSCN ()
}

   -Scope (_GPE)
   -{
   -Name (_HID, "ACPI0006" /* GPE Block Device */)  // _HID: Hardware ID
   -}

 2) Moving _E01 handler after PCI0 scope is defined
-Scope (_GPE)
-{
-Name (_HID, "ACPI0006" /* GPE Block Device */)  // _HID: Hardware 
ID
-Method (_E01, 0, NotSerialized)  // _Exx: Edge-Triggered GPE
-{
-Acquire (\_SB.PCI0.BLCK, 0x)
-\_SB.PCI0.PCNT ()
-Release (\_SB.PCI0.BLCK)
-}
-}
-
 Scope (\_SB.PCI0)
 {
 Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
=
 }
 }
 }
+
+Scope (_GPE)
+{
+Method (_E01, 0, NotSerialized)  // _Exx: Edge-Triggered GPE
+{
+Acquire (\_SB.PCI0.BLCK, 0x)
+\_SB.PCI0.PCNT ()
+Release (\_SB.PCI0.BLCK)
+}
+}
 }

Signed-off-by: Igor Mammedov 
Message-Id: <20221017102146.2254096-12-imamm...@redhat.com>
---
 tests/qtest/bios-tables-test-allowed-diff.h |  34 
 tests/data/acpi/pc/DSDT | Bin 6496 -> 6501 bytes
 tests/data/acpi/pc/DSDT.acpierst| Bin 6456 -> 6461 bytes
 tests/data/acpi/pc/DSDT.acpihmat| Bin 7821 -> 7826 bytes
 tests/data/acpi/pc/DSDT.bridge  | Bin 9570 -> 9575 bytes
 tests/data/acpi/pc/DSDT.cphp| Bin 6960 -> 6965 bytes
 tests/data/acpi/pc/DSDT.dimmpxm | Bin 8150 -> 8155 bytes
 tests/data/acpi/pc/DSDT.hpbridge| Bin 6456 -> 6461 bytes
 tests/data/acpi/pc/DSDT.hpbrroot| Bin 3107 -> 3107 bytes
 tests/data/acpi/pc/DSDT.ipmikcs | Bin 6568 -> 6573 bytes
 tests/data/acpi/pc/DSDT.memhp   | Bin 7855 -> 7860 bytes
 tests/data/acpi/pc/DSDT.nohpet  | Bin 6354 -> 6359 bytes
 tests/data/acpi/pc/DSDT.numamem | Bin 6502 -> 6507 bytes
 tests/data/acpi/pc/DSDT.roothp  | Bin 6694 -> 6699 bytes
 tests/data/acpi/q35/DSDT| Bin 8407 -> 8412 bytes
 tests/data/acpi/q35/DSDT.acpierst   | Bin 8424 -> 8429 bytes
 tests/data/acpi/q35/DSDT.acpihmat   | Bin 9732 -> 9737 bytes
 tests/data/acpi/q35/DSDT.applesmc   | Bin 8453 -> 8458 bytes
 tests/data/acpi/q35/DSDT.bridge | Bin 11536 -> 11541 bytes
 tests/data/acpi/q35/DSDT.cphp   | Bin 8871 -> 8876 bytes
 tests/data/acpi/q35/DSDT.cxl| Bin 9733 -> 9738 bytes
 tests/data/acpi/q35/DSDT.dimmpxm| Bin 10061 -> 10066 bytes
 tests/data/acpi/q35/DSDT.ipmibt | Bin 8482 -> 8487 bytes
 tests/data/acpi/q35/DSDT.ipmismbus  | Bin 8495 -> 8500 bytes
 tests/data/acpi/q35/DSDT.ivrs   | Bin 8424 -> 8429 bytes
 tests/data/acpi/q35/DSDT.memhp  | Bin 9766 -> 9771 bytes
 tests/data/acpi/q35/DSDT.mmio64 | Bin 9537 -> 9542 bytes
 tests/data/acpi/q35/DSDT.multi-bridge   | Bin 8727 -> 8732 bytes
 tests/data/acpi/q35/DSDT.nohpet | Bin 8265 -> 8270 bytes
 tests/data/acpi/q35/DSDT.numamem| Bin 8413 -> 8418 bytes
 tests/data/acpi/q35/DSDT.pvpanic-isa| Bin 8508 -> 8513 bytes
 tests/data/acpi/q35/DSDT.tis.tpm12  | Bin 9013 -> 9018 bytes
 tests/data/acpi/q35/DSDT.tis.tpm2   | Bin 9039 -> 9044 bytes
 tests/data/acpi/q35/DSDT.viot   | Bin 9516 -> 9521 bytes
 tests/data/acpi/q35/DSDT.xapic  | Bin 35770 -> 35775 bytes
 35 files changed, 34 deletions(-)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index 725a1dc798..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,35 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/pc/DSDT",
-"tests/data/acpi/pc/DSDT.bridge",
-"tests/data/acpi/pc/DSDT.ipmikcs",
-"tests/data/acpi/pc/DSDT.cphp",
-"tests/data/acpi/pc/DSDT.memhp",
-"tests/data/acpi/pc/DSDT.numamem",
-"tests/data/acpi/pc/DSDT.nohpet",
-"tests/data/acpi/pc/DSDT.dimmpxm",
-"tests/data/acpi/pc/DSDT.acpihmat",
-"tests/data/acpi/pc/DSDT.acpierst",
-"tests/data/acpi/pc/DSDT.roothp",
-"tests/data/acpi/pc/DSDT.hpbridge",
-"tests/data/acpi/pc/DSDT.hpbrroot",
-"tests/data/acpi/q35/DSDT",
-"tests/data/acpi/q35/DSDT.tis.tpm2",
-"tests/data/acpi/q35/DSDT.tis.tpm12",
-"tests/data/acpi/q35/DSDT.bridge",
-"tests/data/acpi/q35/DSDT.multi-bridge",
-"tests/data/acpi/q35/DSDT.mmio64",
-"tests/data/acpi/q35/DSDT.ipmibt",

[PULL v2 80/82] intel-iommu: drop VTDBus

2022-11-02 Thread Michael S. Tsirkin

From: Jason Wang 

We introduced the VTDBus structure as an intermediate step for searching
the address space. This works well with SID-based matching/lookup. But
when we want to support SID plus PASID based address space lookup,
this intermediate step turns out to be a burden. So the patch simply
drops the VTDBus structure and uses the PCIBus and devfn as the key for
the g_hash_table(). This simplifies the code and the future PASID
extension.

To avoid slowing down existing vtd_find_as_from_bus_num() callers, a
vtd_as cache indexed by the bus number is introduced to store the most
recent search result of a vtd_as that belongs to a specific bus.
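
The per-bus-number cache described above can be sketched in plain C as
follows. This is only an illustration, not the code from the patch: the
struct and function names are hypothetical stand-ins, and a linear scan
stands in for the g_hash_table() lookup. The point is that a cached entry
is checked against the bus's current number before it is trusted, since
the guest may renumber the bus later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUS_MAX 256

/* Hypothetical stand-ins for PCIBus and VTDAddressSpace. */
struct bus { uint8_t num; };
struct addr_space { struct bus *bus; uint8_t devfn; };

struct iommu {
    struct addr_space *spaces;          /* stands in for the hash table */
    size_t n_spaces;
    struct addr_space *cache[BUS_MAX];  /* last result per bus number */
    unsigned slow_lookups;              /* counts full searches */
};

static struct addr_space *find_as_by_bus_num(struct iommu *s, uint8_t bus_num)
{
    struct addr_space *as = s->cache[bus_num];

    /* Trust the cache only if the bus still carries this number. */
    if (as && as->bus->num == bus_num) {
        return as;
    }

    /* Slow path: search everything, then remember the result. */
    s->slow_lookups++;
    as = NULL;
    for (size_t i = 0; i < s->n_spaces; i++) {
        if (s->spaces[i].bus->num == bus_num) {
            as = &s->spaces[i];
            break;
        }
    }
    s->cache[bus_num] = as;
    return as;
}

/* Two lookups on bus 0, then one after the bus is renumbered to 5. */
static unsigned demo_slow_lookups(void)
{
    struct bus b = { .num = 0 };
    struct addr_space as = { .bus = &b, .devfn = 8 };
    struct iommu s = { .spaces = &as, .n_spaces = 1 };

    find_as_by_bus_num(&s, 0);   /* cache miss, full search */
    find_as_by_bus_num(&s, 0);   /* cache hit */
    b.num = 5;                   /* guest reprograms the bus number */
    find_as_by_bus_num(&s, 5);   /* cache miss under the new number */
    return s.slow_lookups;
}

/* A cached entry for the old bus number must not be returned. */
static int demo_stale_entry_rejected(void)
{
    struct bus b = { .num = 0 };
    struct addr_space as = { .bus = &b, .devfn = 8 };
    struct iommu s = { .spaces = &as, .n_spaces = 1 };

    find_as_by_bus_num(&s, 0);
    b.num = 5;
    return find_as_by_bus_num(&s, 0) == NULL;
}
```

Keying the real table on the PCIBus pointer rather than the bus number
is what keeps entries stable across renumbering; the cache only speeds
up callers that start from a bus number.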

Reviewed-by: Peter Xu 
Signed-off-by: Jason Wang 
Message-Id: <20221028061436.30093-3-jasow...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Yi Liu 
---
 include/hw/i386/intel_iommu.h |  11 +-
 hw/i386/intel_iommu.c | 234 +-
 2 files changed, 118 insertions(+), 127 deletions(-)

diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 67653b0f9b..e49fff2a6c 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -58,7 +58,6 @@ typedef struct VTDContextEntry VTDContextEntry;
 typedef struct VTDContextCacheEntry VTDContextCacheEntry;
 typedef struct VTDAddressSpace VTDAddressSpace;
 typedef struct VTDIOTLBEntry VTDIOTLBEntry;
-typedef struct VTDBus VTDBus;
 typedef union VTD_IR_TableEntry VTD_IR_TableEntry;
 typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress;
 typedef struct VTDPASIDDirEntry VTDPASIDDirEntry;
@@ -111,12 +110,6 @@ struct VTDAddressSpace {
 IOVATree *iova_tree;  /* Traces mapped IOVA ranges */
 };
 
-struct VTDBus {
-PCIBus* bus;   /* A reference to the bus to provide 
translation for */
-/* A table of VTDAddressSpace objects indexed by devfn */
-VTDAddressSpace *dev_as[];
-};
-
 struct VTDIOTLBEntry {
 uint64_t gfn;
 uint16_t domain_id;
@@ -253,8 +246,8 @@ struct IntelIOMMUState {
 uint32_t context_cache_gen; /* Should be in [1,MAX] */
 GHashTable *iotlb;  /* IOTLB */
 
-GHashTable *vtd_as_by_busptr;   /* VTDBus objects indexed by PCIBus* 
reference */
-VTDBus *vtd_as_by_bus_num[VTD_PCI_BUS_MAX]; /* VTDBus objects indexed by 
bus number */
+GHashTable *vtd_address_spaces; /* VTD address spaces */
+VTDAddressSpace *vtd_as_cache[VTD_PCI_BUS_MAX]; /* VTD address space cache 
*/
 /* list of registered notifiers */
 QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
 
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 271de995be..9fe5a222eb 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -61,6 +61,16 @@
 } \
 }
 
+/*
+ * PCI bus number (or SID) is not reliable since the device is usually
+ * initialized before guest can configure the PCI bridge
+ * (SECONDARY_BUS_NUMBER).
+ */
+struct vtd_as_key {
+PCIBus *bus;
+uint8_t devfn;
+};
+
 static void vtd_address_space_refresh_all(IntelIOMMUState *s);
 static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);
 
@@ -210,6 +220,27 @@ static guint vtd_uint64_hash(gconstpointer v)
 return (guint)*(const uint64_t *)v;
 }
 
+static gboolean vtd_as_equal(gconstpointer v1, gconstpointer v2)
+{
+const struct vtd_as_key *key1 = v1;
+const struct vtd_as_key *key2 = v2;
+
+return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
+}
+
+/*
+ * Note that we use pointer to PCIBus as the key, so hashing/shifting
+ * based on the pointer value is intended. Note that we deal with
+ * collisions through vtd_as_equal().
+ */
+static guint vtd_as_hash(gconstpointer v)
+{
+const struct vtd_as_key *key = v;
+guint value = (guint)(uintptr_t)key->bus;
+
+return (guint)(value << 8 | key->devfn);
+}
+
 static gboolean vtd_hash_remove_by_domain(gpointer key, gpointer value,
   gpointer user_data)
 {
@@ -248,22 +279,14 @@ static gboolean vtd_hash_remove_by_page(gpointer key, 
gpointer value,
 static void vtd_reset_context_cache_locked(IntelIOMMUState *s)
 {
 VTDAddressSpace *vtd_as;
-VTDBus *vtd_bus;
-GHashTableIter bus_it;
-uint32_t devfn_it;
+GHashTableIter as_it;
 
 trace_vtd_context_cache_reset();
 
-g_hash_table_iter_init(&bus_it, s->vtd_as_by_busptr);
+g_hash_table_iter_init(&as_it, s->vtd_address_spaces);
 
-while (g_hash_table_iter_next (&bus_it, NULL, (void**)&vtd_bus)) {
-for (devfn_it = 0; devfn_it < PCI_DEVFN_MAX; ++devfn_it) {
-vtd_as = vtd_bus->dev_as[devfn_it];
-if (!vtd_as) {
-continue;
-}
-vtd_as->context_cache_entry.context_cache_gen = 0;
-}
+while (g_hash_table_iter_next (&as_it, NULL, (void**)&vtd_as)) {
+vtd_as->context_cache_entry.context_cache_gen = 0;
 }

[PULL v2 12/82] acpi/tests/avocado/bits: disable acpi PSS tests that are failing in biosbits

2022-11-02 Thread Michael S. Tsirkin

From: Ani Sinha 

PSS tests in the acpi test suite seem to be failing in biosbits. This is because
the test is unable to find PSS support in the QEMU BIOS. Let us disable
them for now so that make check does not fail. We can fix the tests and
re-enable them later.

Example failure:

 ACPI _PSS (Pstate) table conformance tests 
[assert] _PSS must exist FAIL
  \_SB_.CPUS.C000
  No _PSS exists
Summary: 1 passed, 1 failed
 ACPI _PSS (Pstate) runtime tests 
[assert] _PSS must exist FAIL
  \_SB_.CPUS.C000
  No _PSS exists
Summary: 0 passed, 1 failed

Cc: Daniel P. Berrangé 
Cc: Paolo Bonzini 
Cc: Maydell Peter 
Cc: John Snow 
Cc: Thomas Huth 
Cc: Alex Bennée 
Cc: Igor Mammedov 
Cc: Michael Tsirkin 
Signed-off-by: Ani Sinha 
Reviewed-by: Alex Bennée 
Message-Id: <20221021095108.104843-4-...@anisinha.ca>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/avocado/acpi-bits/bits-tests/testacpi.py2 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/avocado/acpi-bits/bits-tests/testacpi.py2 
b/tests/avocado/acpi-bits/bits-tests/testacpi.py2
index 9ec452f330..dbc150076e 100644
--- a/tests/avocado/acpi-bits/bits-tests/testacpi.py2
+++ b/tests/avocado/acpi-bits/bits-tests/testacpi.py2
@@ -36,8 +36,8 @@ import time
 
 def register_tests():
 testsuite.add_test("ACPI _MAT (Multiple APIC Table Entry) under Processor 
objects", test_mat, submenu="ACPI Tests")
-testsuite.add_test("ACPI _PSS (Pstate) table conformance tests", test_pss, 
submenu="ACPI Tests")
-testsuite.add_test("ACPI _PSS (Pstate) runtime tests", test_pstates, 
submenu="ACPI Tests")
+#testsuite.add_test("ACPI _PSS (Pstate) table conformance tests", 
test_pss, submenu="ACPI Tests")
+#testsuite.add_test("ACPI _PSS (Pstate) runtime tests", test_pstates, 
submenu="ACPI Tests")
 testsuite.add_test("ACPI DSDT (Differentiated System Description Table)", 
test_dsdt, submenu="ACPI Tests")
 testsuite.add_test("ACPI FACP (Fixed ACPI Description Table)", test_facp, 
submenu="ACPI Tests")
 testsuite.add_test("ACPI HPET (High Precision Event Timer Table)", 
test_hpet, submenu="ACPI Tests")
-- 
MST

[PULL v2 51/82] vhost-user: Fix out of order vring host notification handling

2022-11-02 Thread Michael S. Tsirkin

From: Yajun Wu 

The vhost backend sends a host notification for every VQ. If the backend
creates VQs in parallel, the VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG
messages may arrive at QEMU in a different order than the incremental
queue index order.

For example, suppose VQ 1's message arrives earlier than VQ 0's.
After allocating the VhostUserHostNotifier for VQ 1, the GPtrArray becomes

[ nil, VQ1 pointer ]

After allocating the VhostUserHostNotifier for VQ 0, the GPtrArray becomes

[ VQ0 pointer, nil, VQ1 pointer ]

This is wrong. fetch_notifier will return NULL for VQ 1 in
vhost_user_get_vring_base, which causes the host notifier removal to be
missed (a leak).

The fix is to remove the current (NULL) element from the GPtrArray first,
so that the new element is inserted at the right position.
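
The index shift described above can be reproduced with a toy model in
plain C. This is a sketch under assumptions, not the QEMU code: the
pa_* helpers are hypothetical and only mimic the shifting behaviour of
GPtrArray insert/remove, and the model assumes the array is first grown
with NULL slots up to the requested index before the slot is inspected.
Because of that assumption the intermediate array contents may differ
slightly from the illustration in the commit message, but the effect is
the same: without the removal, VQ 1's notifier ends up at the wrong index.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CAP 8

/* Toy pointer array; a simplified stand-in for GLib's GPtrArray. */
struct ptr_array {
    void *data[CAP];
    size_t len;
};

/* Grow to at least new_len, filling new slots with NULL. */
static void pa_set_size(struct ptr_array *a, size_t new_len)
{
    while (a->len < new_len) {
        a->data[a->len++] = NULL;
    }
}

/* Insert p at idx, shifting later elements up by one. */
static void pa_insert(struct ptr_array *a, size_t idx, void *p)
{
    memmove(&a->data[idx + 1], &a->data[idx],
            (a->len - idx) * sizeof(void *));
    a->data[idx] = p;
    a->len++;
}

/* Remove the element at idx, shifting later elements down by one. */
static void pa_remove_index(struct ptr_array *a, size_t idx)
{
    memmove(&a->data[idx], &a->data[idx + 1],
            (a->len - idx - 1) * sizeof(void *));
    a->len--;
}

/* Model of the lookup path; apply_fix toggles the added removal. */
static void *fetch_or_create(struct ptr_array *a, size_t idx,
                             void *fresh, int apply_fix)
{
    if (idx >= a->len) {
        pa_set_size(a, idx + 1);
    }
    if (!a->data[idx]) {
        if (apply_fix) {
            /* Drop the NULL placeholder so insert does not shift others. */
            pa_remove_index(a, idx);
        }
        pa_insert(a, idx, fresh);
    }
    return a->data[idx];
}

/* Index where VQ 1's notifier ends up when VQ 1 arrives before VQ 0. */
static size_t vq1_slot_after_out_of_order(int apply_fix)
{
    static int vq0, vq1;               /* dummy notifier objects */
    struct ptr_array a = { .len = 0 };

    fetch_or_create(&a, 1, &vq1, apply_fix);
    fetch_or_create(&a, 0, &vq0, apply_fix);

    for (size_t i = 0; i < a.len; i++) {
        if (a.data[i] == &vq1) {
            return i;
        }
    }
    return a.len;
}
```

In this model VQ 1 lands at index 2 without the removal and at index 1
with it, which matches the leak described above: a later lookup at
index 1 finds NULL instead of VQ 1's notifier.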

Fixes: 503e355465 ("virtio/vhost-user: dynamically assign 
VhostUserHostNotifiers")
Signed-off-by: Yajun Wu 
Acked-by: Parav Pandit 

Message-Id: <20221018023651.1359420-1-yaj...@nvidia.com>
Reviewed-by: Alex Bennée 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/vhost-user.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index bb5164b753..abe23d4ebe 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1593,6 +1593,11 @@ static VhostUserHostNotifier 
*fetch_or_create_notifier(VhostUserState *u,
 
 n = g_ptr_array_index(u->notifiers, idx);
 if (!n) {
+/*
+ * In case notifications arrive out-of-order,
+ * make room for current index.
+ */
+g_ptr_array_remove_index(u->notifiers, idx);
 n = g_new0(VhostUserHostNotifier, 1);
 n->idx = idx;
 g_ptr_array_insert(u->notifiers, idx, n);
-- 
MST

[PULL v2 27/82] hw/mem/cxl-type3: Add MSIX support

2022-11-02 Thread Michael S. Tsirkin

From: Jonathan Cameron 

This will be used by several upcoming patch sets so break it out
such that it doesn't matter which one lands first.

Signed-off-by: Jonathan Cameron 
Message-Id: <20221014151045.24781-3-jonathan.came...@huawei.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/mem/cxl_type3.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index a71bf1afeb..568c9d62f5 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -13,6 +13,7 @@
 #include "qemu/rcu.h"
 #include "sysemu/hostmem.h"
 #include "hw/cxl/cxl.h"
+#include "hw/pci/msix.h"
 
 /*
  * Null value of all Fs suggested by IEEE RA guidelines for use of
@@ -146,6 +147,8 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp)
 ComponentRegisters *regs = &cxl_cstate->crb;
 MemoryRegion *mr = &regs->component_registers;
 uint8_t *pci_conf = pci_dev->config;
+unsigned short msix_num = 1;
+int i;
 
 if (!cxl_setup_memory(ct3d, errp)) {
 return;
@@ -180,6 +183,12 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp)
  PCI_BASE_ADDRESS_SPACE_MEMORY |
  PCI_BASE_ADDRESS_MEM_TYPE_64,
  &ct3d->cxl_dstate.device_registers);
+
+/* MSI(-X) Initialization */
+msix_init_exclusive_bar(pci_dev, msix_num, 4, NULL);
+for (i = 0; i < msix_num; i++) {
+msix_vector_use(pci_dev, i);
+}
 }
 
 static void ct3_exit(PCIDevice *pci_dev)
-- 
MST

[PULL v2 55/82] tests: acpi: update expected DSDT after ISA bridge is moved directly under PCI host bridge

2022-11-02 Thread Michael S. Tsirkin

From: Igor Mammedov 

Example of the change for a PC machine with hotplug disabled on the root bus (no
BSEL case):

 -Field (PCI0.ISA.P40C, ByteAcc, NoLock, Preserve)
 +Field (S08.P40C, ByteAcc, NoLock, Preserve)

 ===
 -Scope (_SB.PCI0)
 -{
 -Device (ISA)
 -{
 -Name (_ADR, 0x0001)  // _ADR: Address
 -OperationRegion (P40C, PCI_Config, 0x60, 0x04)
 ...
 -}
 -}
 -
  Scope (_SB)
 ===
 +Device (S08)
 +{
 +Name (_ADR, 0x0001)  // _ADR: Address
 +OperationRegion (P40C, PCI_Config, 0x60, 0x04)
 ...
 +}
 +
  Device (S10)
  {
  Name (_ADR, 0x0002)  // _ADR: Address

With hotplug enabled on the root bus (i.e. the bus has BSEL configured),
the following additional entries will be seen:

 +Name (ASUN, One)
 +Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
 +{
 +Local0 = Package (0x02)
 +{
 +BSEL,
 +ASUN
 +}
 +Return (PDSM (Arg0, Arg1, Arg2, Arg3, Local0))
 +}

similar changes are expected for Q35 modulo:

 -Field (PCI0.ISA.PIRQ, ByteAcc, NoLock, Preserve)
 +Field (SF8.PIRQ, ByteAcc, NoLock, Preserve)

and bridge address

Signed-off-by: Igor Mammedov 
Message-Id: <20221017102146.2254096-5-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test-allowed-diff.h |  34 
 tests/data/acpi/pc/DSDT | Bin 6422 -> 6496 bytes
 tests/data/acpi/pc/DSDT.acpierst| Bin 6382 -> 6456 bytes
 tests/data/acpi/pc/DSDT.acpihmat| Bin 7747 -> 7821 bytes
 tests/data/acpi/pc/DSDT.bridge  | Bin 9496 -> 9570 bytes
 tests/data/acpi/pc/DSDT.cphp| Bin 6886 -> 6960 bytes
 tests/data/acpi/pc/DSDT.dimmpxm | Bin 8076 -> 8150 bytes
 tests/data/acpi/pc/DSDT.hpbridge| Bin 6382 -> 6456 bytes
 tests/data/acpi/pc/DSDT.hpbrroot| Bin 3069 -> 3107 bytes
 tests/data/acpi/pc/DSDT.ipmikcs | Bin 6494 -> 6568 bytes
 tests/data/acpi/pc/DSDT.memhp   | Bin 7781 -> 7855 bytes
 tests/data/acpi/pc/DSDT.nohpet  | Bin 6280 -> 6354 bytes
 tests/data/acpi/pc/DSDT.numamem | Bin 6428 -> 6502 bytes
 tests/data/acpi/pc/DSDT.roothp  | Bin 6656 -> 6694 bytes
 tests/data/acpi/q35/DSDT| Bin 8320 -> 8418 bytes
 tests/data/acpi/q35/DSDT.acpierst   | Bin 8337 -> 8435 bytes
 tests/data/acpi/q35/DSDT.acpihmat   | Bin 9645 -> 9743 bytes
 tests/data/acpi/q35/DSDT.applesmc   | Bin 8366 -> 8464 bytes
 tests/data/acpi/q35/DSDT.bridge | Bin 11449 -> 11547 bytes
 tests/data/acpi/q35/DSDT.cphp   | Bin 8784 -> 8882 bytes
 tests/data/acpi/q35/DSDT.cxl| Bin 9646 -> 9744 bytes
 tests/data/acpi/q35/DSDT.dimmpxm| Bin 9974 -> 10072 bytes
 tests/data/acpi/q35/DSDT.ipmibt | Bin 8395 -> 8493 bytes
 tests/data/acpi/q35/DSDT.ipmismbus  | Bin 8409 -> 8507 bytes
 tests/data/acpi/q35/DSDT.ivrs   | Bin 8337 -> 8435 bytes
 tests/data/acpi/q35/DSDT.memhp  | Bin 9679 -> 9777 bytes
 tests/data/acpi/q35/DSDT.mmio64 | Bin 9450 -> 9548 bytes
 tests/data/acpi/q35/DSDT.multi-bridge   | Bin 8640 -> 8738 bytes
 tests/data/acpi/q35/DSDT.nohpet | Bin 8178 -> 8276 bytes
 tests/data/acpi/q35/DSDT.numamem| Bin 8326 -> 8424 bytes
 tests/data/acpi/q35/DSDT.pvpanic-isa| Bin 8421 -> 8519 bytes
 tests/data/acpi/q35/DSDT.tis.tpm12  | Bin 8926 -> 9024 bytes
 tests/data/acpi/q35/DSDT.tis.tpm2   | Bin 8952 -> 9050 bytes
 tests/data/acpi/q35/DSDT.viot   | Bin 9429 -> 9527 bytes
 tests/data/acpi/q35/DSDT.xapic  | Bin 35683 -> 35781 bytes
 35 files changed, 34 deletions(-)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index 570b17478e..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,35 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/pc/DSDT",
-"tests/data/acpi/pc/DSDT.acpierst",
-"tests/data/acpi/pc/DSDT.acpihmat",
-"tests/data/acpi/pc/DSDT.bridge",
-"tests/data/acpi/pc/DSDT.cphp",
-"tests/data/acpi/pc/DSDT.dimmpxm",
-"tests/data/acpi/pc/DSDT.hpbridge",
-"tests/data/acpi/pc/DSDT.hpbrroot",
-"tests/data/acpi/pc/DSDT.ipmikcs",
-"tests/data/acpi/pc/DSDT.memhp",
-"tests/data/acpi/pc/DSDT.nohpet",
-"tests/data/acpi/pc/DSDT.numamem",
-"tests/data/acpi/pc/DSDT.roothp",
-"tests/data/acpi/q35/DSDT",
-"tests/data/acpi/q35/DSDT.acpierst",
-"tests/data/acpi/q35/DSDT.acpihmat",

[PULL v2 28/82] hw/cxl/cdat: CXL CDAT Data Object Exchange implementation

2022-11-02 Thread Michael S. Tsirkin

From: Huai-Cheng Kuo 

Add the Data Object Exchange (DOE) implementation of the CXL Coherent
Device Attribute Table (CDAT). The implementation follows "Coherent Device
Attribute Table Specification, Rev. 1.03, July. 2022" and "Compute
Express Link Specification, Rev. 3.0, July. 2022".

This patch adds core support that will be shared by both
end-points and switch port emulation.

Signed-off-by: Huai-Cheng Kuo 
Signed-off-by: Chris Browy 
Signed-off-by: Jonathan Cameron 
Message-Id: <20221014151045.24781-4-jonathan.came...@huawei.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/cxl/cxl_cdat.h  | 165 
 include/hw/cxl/cxl_component.h |   7 ++
 include/hw/cxl/cxl_device.h|   3 +
 include/hw/cxl/cxl_pci.h   |   1 +
 hw/cxl/cxl-cdat.c  | 224 +
 hw/cxl/meson.build |   1 +
 6 files changed, 401 insertions(+)
 create mode 100644 include/hw/cxl/cxl_cdat.h
 create mode 100644 hw/cxl/cxl-cdat.c

diff --git a/include/hw/cxl/cxl_cdat.h b/include/hw/cxl/cxl_cdat.h
new file mode 100644
index 00..52c232e912
--- /dev/null
+++ b/include/hw/cxl/cxl_cdat.h
@@ -0,0 +1,165 @@
+/*
+ * CXL CDAT Structure
+ *
+ * Copyright (C) 2021 Avery Design Systems, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_CDAT_H
+#define CXL_CDAT_H
+
+#include "hw/cxl/cxl_pci.h"
+
+/*
+ * Reference:
+ *   Coherent Device Attribute Table (CDAT) Specification, Rev. 1.03, July. 
2022
+ *   Compute Express Link (CXL) Specification, Rev. 3.0, Aug. 2022
+ */
+/* Table Access DOE - CXL r3.0 8.1.11 */
+#define CXL_DOE_TABLE_ACCESS  2
+#define CXL_DOE_PROTOCOL_CDAT ((CXL_DOE_TABLE_ACCESS << 16) | 
CXL_VENDOR_ID)
+
+/* Read Entry - CXL r3.0 8.1.11.1 */
+#define CXL_DOE_TAB_TYPE_CDAT 0
+#define CXL_DOE_TAB_ENT_MAX 0x
+
+/* Read Entry Request - CXL r3.0 8.1.11.1 Table 8-13 */
+#define CXL_DOE_TAB_REQ 0
+typedef struct CDATReq {
+DOEHeader header;
+uint8_t req_code;
+uint8_t table_type;
+uint16_t entry_handle;
+} QEMU_PACKED CDATReq;
+
+/* Read Entry Response - CXL r3.0 8.1.11.1 Table 8-14 */
+#define CXL_DOE_TAB_RSP 0
+typedef struct CDATRsp {
+DOEHeader header;
+uint8_t rsp_code;
+uint8_t table_type;
+uint16_t entry_handle;
+} QEMU_PACKED CDATRsp;
+
+/* CDAT Table Format - CDAT Table 1 */
+#define CXL_CDAT_REV 2
+typedef struct CDATTableHeader {
+uint32_t length;
+uint8_t revision;
+uint8_t checksum;
+uint8_t reserved[6];
+uint32_t sequence;
+} QEMU_PACKED CDATTableHeader;
+
+/* CDAT Structure Types - CDAT Table 2 */
+typedef enum {
+CDAT_TYPE_DSMAS = 0,
+CDAT_TYPE_DSLBIS = 1,
+CDAT_TYPE_DSMSCIS = 2,
+CDAT_TYPE_DSIS = 3,
+CDAT_TYPE_DSEMTS = 4,
+CDAT_TYPE_SSLBIS = 5,
+} CDATType;
+
+typedef struct CDATSubHeader {
+uint8_t type;
+uint8_t reserved;
+uint16_t length;
+} CDATSubHeader;
+
+/* Device Scoped Memory Affinity Structure - CDAT Table 3 */
+typedef struct CDATDsmas {
+CDATSubHeader header;
+uint8_t DSMADhandle;
+uint8_t flags;
+#define CDAT_DSMAS_FLAG_NV  (1 << 2)
+#define CDAT_DSMAS_FLAG_SHAREABLE   (1 << 3)
+#define CDAT_DSMAS_FLAG_HW_COHERENT (1 << 4)
+#define CDAT_DSMAS_FLAG_DYNAMIC_CAP (1 << 5)
+uint16_t reserved;
+uint64_t DPA_base;
+uint64_t DPA_length;
+} QEMU_PACKED CDATDsmas;
+
+/* Device Scoped Latency and Bandwidth Information Structure - CDAT Table 5 */
+typedef struct CDATDslbis {
+CDATSubHeader header;
+uint8_t handle;
+/* Definitions of these fields refer directly to HMAT fields */
+uint8_t flags;
+uint8_t data_type;
+uint8_t reserved;
+uint64_t entry_base_unit;
+uint16_t entry[3];
+uint16_t reserved2;
+} QEMU_PACKED CDATDslbis;
+
+/* Device Scoped Memory Side Cache Information Structure - CDAT Table 6 */
+typedef struct CDATDsmscis {
+CDATSubHeader header;
+uint8_t DSMAS_handle;
+uint8_t reserved[3];
+uint64_t memory_side_cache_size;
+uint32_t cache_attributes;
+} QEMU_PACKED CDATDsmscis;
+
+/* Device Scoped Initiator Structure - CDAT Table 7 */
+typedef struct CDATDsis {
+CDATSubHeader header;
+uint8_t flags;
+uint8_t handle;
+uint16_t reserved;
+} QEMU_PACKED CDATDsis;
+
+/* Device Scoped EFI Memory Type Structure - CDAT Table 8 */
+typedef struct CDATDsemts {
+CDATSubHeader header;
+uint8_t DSMAS_handle;
+uint8_t EFI_memory_type_attr;
+uint16_t reserved;
+uint64_t DPA_offset;
+uint64_t DPA_length;
+} QEMU_PACKED CDATDsemts;
+
+/* Switch Scoped Latency and Bandwidth Information Structure - CDAT Table 9 */
+typedef struct CDATSslbisHeader {
+CDATSubHeader header;
+uint8_t data_type;
+uint8_t reserved[3];
+uint64_t entry_base_unit;
+} QEMU_PACKED CDATSslbisHeader;
+
+/* Switch Scoped Latency and Bandwidth Entry - CDAT Table 10 */
+typedef struct

[PULL v2 74/82] tests: Add HMAT AArch64/virt empty table files

2022-11-02 Thread Michael S. Tsirkin

From: Hesham Almatary 

Signed-off-by: Hesham Almatary 
Message-Id: <20221027100037.251-6-hesham.almat...@huawei.com>
Tested-by: Yicong Yang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 5 +
 tests/data/acpi/virt/APIC.acpihmatvirt  | 0
 tests/data/acpi/virt/DSDT.acpihmatvirt  | 0
 tests/data/acpi/virt/HMAT.acpihmatvirt  | 0
 tests/data/acpi/virt/PPTT.acpihmatvirt  | 0
 tests/data/acpi/virt/SRAT.acpihmatvirt  | 0
 6 files changed, 5 insertions(+)
 create mode 100644 tests/data/acpi/virt/APIC.acpihmatvirt
 create mode 100644 tests/data/acpi/virt/DSDT.acpihmatvirt
 create mode 100644 tests/data/acpi/virt/HMAT.acpihmatvirt
 create mode 100644 tests/data/acpi/virt/PPTT.acpihmatvirt
 create mode 100644 tests/data/acpi/virt/SRAT.acpihmatvirt

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..4f849715bd 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,6 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/virt/APIC.acpihmatvirt",
+"tests/data/acpi/virt/DSDT.acpihmatvirt",
+"tests/data/acpi/virt/HMAT.acpihmatvirt",
+"tests/data/acpi/virt/PPTT.acpihmatvirt",
+"tests/data/acpi/virt/SRAT.acpihmatvirt",
diff --git a/tests/data/acpi/virt/APIC.acpihmatvirt 
b/tests/data/acpi/virt/APIC.acpihmatvirt
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/virt/DSDT.acpihmatvirt 
b/tests/data/acpi/virt/DSDT.acpihmatvirt
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/virt/HMAT.acpihmatvirt 
b/tests/data/acpi/virt/HMAT.acpihmatvirt
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/virt/PPTT.acpihmatvirt 
b/tests/data/acpi/virt/PPTT.acpihmatvirt
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/virt/SRAT.acpihmatvirt 
b/tests/data/acpi/virt/SRAT.acpihmatvirt
new file mode 100644
index 00..e69de29bb2
-- 
MST

[PULL v2 57/82] acpi: add get_dev_aml_func() helper

2022-11-02 Thread Michael S. Tsirkin

From: Igor Mammedov 

It will be used in follow-up commits to figure out whether a
device has its own, device-specific AML block.
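
The getter/caller split can be sketched outside QEMU as follows. This is
only an illustration with hypothetical types (Device, aml_fn) and no QOM
machinery: the getter returns the build callback or NULL, so a caller can
test for a device-specific AML block without invoking it, and the call
helper is rebuilt on top of the getter.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Device Device;   /* hypothetical stand-in for DeviceState */
typedef void (*aml_fn)(Device *dev, int *scope);

struct Device {
    aml_fn build_aml;           /* NULL when there is no device-specific AML */
};

/* Return the AML builder for dev, or NULL if it has none. */
static aml_fn get_aml_func(Device *dev)
{
    return dev->build_aml;
}

/* Invoke the builder only when one exists. */
static void call_aml_func(Device *dev, int *scope)
{
    aml_fn fn = get_aml_func(dev);

    if (fn) {
        fn(dev, scope);
    }
}

/* Dummy builder: counts how many times it was asked to add AML. */
static void add_one(Device *dev, int *scope)
{
    (void)dev;
    (*scope)++;
}

/* Returns the scope counter after one call_aml_func() invocation. */
static int demo_scope_after_call(int has_aml)
{
    Device d = { .build_aml = has_aml ? add_one : NULL };
    int scope = 0;

    call_aml_func(&d, &scope);
    return scope;
}

/* Returns 1 if the getter reports a device-specific builder. */
static int demo_has_aml(int has_aml)
{
    Device d = { .build_aml = has_aml ? add_one : NULL };

    return get_aml_func(&d) != NULL;
}
```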

Signed-off-by: Igor Mammedov 
Message-Id: <20221017102146.2254096-7-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Ani Sinha 
---
 include/hw/acpi/acpi_aml_interface.h | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/hw/acpi/acpi_aml_interface.h 
b/include/hw/acpi/acpi_aml_interface.h
index ab76f0e55d..436da069d6 100644
--- a/include/hw/acpi/acpi_aml_interface.h
+++ b/include/hw/acpi/acpi_aml_interface.h
@@ -29,11 +29,20 @@ struct AcpiDevAmlIfClass {
 dev_aml_fn build_dev_aml;
 };
 
-static inline void call_dev_aml_func(DeviceState *dev, Aml *scope)
+static inline dev_aml_fn get_dev_aml_func(DeviceState *dev)
 {
 if (object_dynamic_cast(OBJECT(dev), TYPE_ACPI_DEV_AML_IF)) {
 AcpiDevAmlIfClass *klass = ACPI_DEV_AML_IF_GET_CLASS(dev);
-klass->build_dev_aml(ACPI_DEV_AML_IF(dev), scope);
+return klass->build_dev_aml;
+}
+return NULL;
+}
+
+static inline void call_dev_aml_func(DeviceState *dev, Aml *scope)
+{
+dev_aml_fn fn = get_dev_aml_func(dev);
+if (fn) {
+fn(ACPI_DEV_AML_IF(dev), scope);
 }
 }
 
-- 
MST

[PULL v2 33/82] hw/virtio/virtio-iommu-pci: Enforce the device is plugged on the root bus

2022-11-02 Thread Michael S. Tsirkin

From: Eric Auger 

In theory the virtio-iommu-pci could be plugged anywhere in the PCIe
topology, and as long as the dt/acpi info is properly built this should
work. However, at the moment we fail to do that because the
virtio-iommu-pci BDF is not computed at plug time, and in that case
vms->virtio_iommu_bdf gets an incorrect value.

For instance, if the virtio-iommu-pci is plugged onto a pcie root port
and the virtio-iommu protects a virtio-block-pci device, the guest does
not boot.

So let's not pretend we support this case, and fail the initialize()
if we detect the virtio-iommu-pci is plugged anywhere other than on the
root bus. This ability is not needed anyway.

Signed-off-by: Eric Auger 
Message-Id: <20221012163448.121368-1-eric.au...@redhat.com>
Reviewed-by: Jean-Philippe Brucker 
Tested-by: Jean-Philippe Brucker 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-iommu-pci.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/virtio-iommu-pci.c b/hw/virtio/virtio-iommu-pci.c
index 79ea8334f0..7ef2f9dcdb 100644
--- a/hw/virtio/virtio-iommu-pci.c
+++ b/hw/virtio/virtio-iommu-pci.c
@@ -17,6 +17,7 @@
 #include "hw/qdev-properties-system.h"
 #include "qapi/error.h"
 #include "hw/boards.h"
+#include "hw/pci/pci_bus.h"
 #include "qom/object.h"
 
 typedef struct VirtIOIOMMUPCI VirtIOIOMMUPCI;
@@ -44,6 +45,7 @@ static Property virtio_iommu_pci_properties[] = {
 static void virtio_iommu_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
 {
 VirtIOIOMMUPCI *dev = VIRTIO_IOMMU_PCI(vpci_dev);
+PCIBus *pbus = pci_get_bus(&vpci_dev->pci_dev);
 DeviceState *vdev = DEVICE(&dev->vdev);
 VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
 
@@ -57,11 +59,17 @@ static void virtio_iommu_pci_realize(VirtIOPCIProxy 
*vpci_dev, Error **errp)
 s->reserved_regions[i].type != VIRTIO_IOMMU_RESV_MEM_T_MSI) {
 error_setg(errp, "reserved region %d has an invalid type", i);
 error_append_hint(errp, "Valid values are 0 and 1\n");
+return;
 }
 }
+if (!pci_bus_is_root(pbus)) {
+error_setg(errp, "virtio-iommu-pci must be plugged on the root bus");
+return;
+}
+
 object_property_set_link(OBJECT(dev), "primary-bus",
- OBJECT(pci_get_bus(&vpci_dev->pci_dev)),
- &error_abort);
+ OBJECT(pbus), &error_abort);
+
 virtio_pci_force_virtio_1(vpci_dev);
 qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
 }
-- 
MST

[PULL v2 56/82] tests: acpi: whitelist DSDT before generating ICH9_SMB AML automatically

2022-11-02 Thread Michael S. Tsirkin

From: Igor Mammedov 

Signed-off-by: Igor Mammedov 
Message-Id: <20221017102146.2254096-6-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 21 +
 1 file changed, 21 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..fd5852776c 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,22 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/q35/DSDT",
+"tests/data/acpi/q35/DSDT.acpierst",
+"tests/data/acpi/q35/DSDT.acpihmat",
+"tests/data/acpi/q35/DSDT.applesmc",
+"tests/data/acpi/q35/DSDT.bridge",
+"tests/data/acpi/q35/DSDT.cphp",
+"tests/data/acpi/q35/DSDT.cxl",
+"tests/data/acpi/q35/DSDT.dimmpxm",
+"tests/data/acpi/q35/DSDT.ipmibt",
+"tests/data/acpi/q35/DSDT.ipmismbus",
+"tests/data/acpi/q35/DSDT.ivrs",
+"tests/data/acpi/q35/DSDT.memhp",
+"tests/data/acpi/q35/DSDT.mmio64",
+"tests/data/acpi/q35/DSDT.multi-bridge",
+"tests/data/acpi/q35/DSDT.nohpet",
+"tests/data/acpi/q35/DSDT.numamem",
+"tests/data/acpi/q35/DSDT.pvpanic-isa",
+"tests/data/acpi/q35/DSDT.tis.tpm12",
+"tests/data/acpi/q35/DSDT.tis.tpm2",
+"tests/data/acpi/q35/DSDT.viot",
+"tests/data/acpi/q35/DSDT.xapic",
-- 
MST
