Re: [PATCH] vhost-vdpa: skip TPM CRB memory section
On Wed, Nov 23, 2022 at 8:20 AM Marc-André Lureau wrote: > > Hi > > On Wed, Nov 23, 2022 at 12:32 AM Michael S. Tsirkin wrote: >> >> On Tue, Nov 22, 2022 at 06:53:49PM +0400, marcandre.lur...@redhat.com wrote: >> > From: Marc-André Lureau >> > >> > 851d6d1a0f ("vfio/common: remove spurious tpm-crb-cmd misalignment >> > warning") removed the warning on vfio_listener_region_add() path. >> > >> > An error is reported for vhost-vdpa case: >> > qemu-kvm: vhost_vdpa_listener_region_add received unaligned region >> > >> > Skip the CRB device. >> > >> > Fixes: >> > https://bugzilla.redhat.com/show_bug.cgi?id=2141965 >> > >> > Signed-off-by: Marc-André Lureau >> > --- >> > hw/virtio/vhost-vdpa.c | 6 ++ >> > 1 file changed, 6 insertions(+) >> > >> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c >> > index 7468e44b87..9d7206e4b8 100644 >> > --- a/hw/virtio/vhost-vdpa.c >> > +++ b/hw/virtio/vhost-vdpa.c >> > @@ -19,6 +19,7 @@ >> > #include "hw/virtio/virtio-net.h" >> > #include "hw/virtio/vhost-shadow-virtqueue.h" >> > #include "hw/virtio/vhost-vdpa.h" >> > +#include "sysemu/tpm.h" >> > #include "exec/address-spaces.h" >> > #include "migration/blocker.h" >> > #include "qemu/cutils.h" >> > @@ -46,6 +47,11 @@ static bool >> > vhost_vdpa_listener_skipped_section(MemoryRegionSection *section, >> > { >> > Int128 llend; >> > >> > +if (TPM_IS_CRB(section->mr->owner)) { >> > +/* The CRB command buffer has its base address unaligned. */ >> > +return true; >> > +} >> > + >> >> Quite a hack. We can't really keep adding dependency on random devices > > > Agree it's not great. but it's not strictly a dependency. At least you can > still build with !CONFIG_TPM. > >> to vhost. And would you add hacks like this to listeners? >> Pls figure out what's special about this buffer. >> Also if this section is unaligned then doesn't it break up >> other aligned sections? > > > See the original discussion: > https://patchew.org/QEMU/20220208133842.112017-1-eric.au...@redhat.com/20220208133842.112017-2-eric.au...@redhat.com/ > Based on that, I'd tune the comment to something like: QEMU represents the CRB cmd/response buffer as a standard RAM region to the listeners, but real HW would not be able to access them. Besides, these regions may not be properly page aligned. Is that more accurate? Thanks! > It is not clear whether aligning the tpm-crb-cmd region would work > (overlapping tpm-crb-mmio). > > Peter Maydell said: "There's nothing that guarantees alignment for memory > regions at all, whether they're RAM, IO or anything else.". > > Maybe vfio/vhost should simply skip those odd regions silently. > > >> >> >> > if ((!memory_region_is_ram(section->mr) && >> > !memory_region_is_iommu(section->mr)) || >> > memory_region_is_protected(section->mr) || >> > -- >> > 2.38.1 >>
Re: [PATCH] vhost-vdpa: skip TPM CRB memory section
On Wed, Nov 23, 2022 at 11:20:41AM +0400, Marc-André Lureau wrote: > Hi > > On Wed, Nov 23, 2022 at 12:32 AM Michael S. Tsirkin wrote: > > On Tue, Nov 22, 2022 at 06:53:49PM +0400, marcandre.lur...@redhat.com > wrote: > > From: Marc-André Lureau > > > > 851d6d1a0f ("vfio/common: remove spurious tpm-crb-cmd misalignment > > warning") removed the warning on vfio_listener_region_add() path. > > > > An error is reported for vhost-vdpa case: > > qemu-kvm: vhost_vdpa_listener_region_add received unaligned region > > > > Skip the CRB device. > > > > Fixes: > > https://bugzilla.redhat.com/show_bug.cgi?id=2141965 > > > > Signed-off-by: Marc-André Lureau > > --- > > hw/virtio/vhost-vdpa.c | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > > index 7468e44b87..9d7206e4b8 100644 > > --- a/hw/virtio/vhost-vdpa.c > > +++ b/hw/virtio/vhost-vdpa.c > > @@ -19,6 +19,7 @@ > > #include "hw/virtio/virtio-net.h" > > #include "hw/virtio/vhost-shadow-virtqueue.h" > > #include "hw/virtio/vhost-vdpa.h" > > +#include "sysemu/tpm.h" > > #include "exec/address-spaces.h" > > #include "migration/blocker.h" > > #include "qemu/cutils.h" > > @@ -46,6 +47,11 @@ static bool vhost_vdpa_listener_skipped_section > (MemoryRegionSection *section, > > { > > Int128 llend; > > > > + if (TPM_IS_CRB(section->mr->owner)) { > > + /* The CRB command buffer has its base address unaligned. */ > > + return true; > > + } > > + > > Quite a hack. We can't really keep adding dependency on random devices > > > Agree it's not great. but it's not strictly a dependency. At least you can > still build with !CONFIG_TPM. but what does it have to do with tpm? > > to vhost. And would you add hacks like this to listeners? > Pls figure out what's special about this buffer. > Also if this section is unaligned then doesn't it break up > other aligned sections? > > > See the original discussion: > https://patchew.org/QEMU/20220208133842.112017-1-eric.au...@redhat.com/ > 20220208133842.112017-2-eric.au...@redhat.com/ > > It is not clear whether aligning the tpm-crb-cmd region would work > (overlapping > tpm-crb-mmio). > > Peter Maydell said: "There's nothing that guarantees alignment for memory > regions at all, whether they're RAM, IO or anything else.". > > Maybe vfio/vhost should simply skip those odd regions silently. How do we detect them? Are these regions not DMA-able on real hardware? > > > > > > if ((!memory_region_is_ram(section->mr) && > > !memory_region_is_iommu(section->mr)) || > > memory_region_is_protected(section->mr) || > > -- > > 2.38.1 > >
Re: [PATCH] vhost-vdpa: skip TPM CRB memory section
Hi On Wed, Nov 23, 2022 at 12:32 AM Michael S. Tsirkin wrote: > On Tue, Nov 22, 2022 at 06:53:49PM +0400, marcandre.lur...@redhat.com > wrote: > > From: Marc-André Lureau > > > > 851d6d1a0f ("vfio/common: remove spurious tpm-crb-cmd misalignment > > warning") removed the warning on vfio_listener_region_add() path. > > > > An error is reported for vhost-vdpa case: > > qemu-kvm: vhost_vdpa_listener_region_add received unaligned region > > > > Skip the CRB device. > > > > Fixes: > > https://bugzilla.redhat.com/show_bug.cgi?id=2141965 > > > > Signed-off-by: Marc-André Lureau > > --- > > hw/virtio/vhost-vdpa.c | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > > index 7468e44b87..9d7206e4b8 100644 > > --- a/hw/virtio/vhost-vdpa.c > > +++ b/hw/virtio/vhost-vdpa.c > > @@ -19,6 +19,7 @@ > > #include "hw/virtio/virtio-net.h" > > #include "hw/virtio/vhost-shadow-virtqueue.h" > > #include "hw/virtio/vhost-vdpa.h" > > +#include "sysemu/tpm.h" > > #include "exec/address-spaces.h" > > #include "migration/blocker.h" > > #include "qemu/cutils.h" > > @@ -46,6 +47,11 @@ static bool > vhost_vdpa_listener_skipped_section(MemoryRegionSection *section, > > { > > Int128 llend; > > > > +if (TPM_IS_CRB(section->mr->owner)) { > > +/* The CRB command buffer has its base address unaligned. */ > > +return true; > > +} > > + > > Quite a hack. We can't really keep adding dependency on random devices > Agree it's not great. but it's not strictly a dependency. At least you can still build with !CONFIG_TPM. to vhost. And would you add hacks like this to listeners? > Pls figure out what's special about this buffer. > Also if this section is unaligned then doesn't it break up > other aligned sections? > See the original discussion: https://patchew.org/QEMU/20220208133842.112017-1-eric.au...@redhat.com/20220208133842.112017-2-eric.au...@redhat.com/ It is not clear whether aligning the tpm-crb-cmd region would work (overlapping tpm-crb-mmio). Peter Maydell said: "There's nothing that guarantees alignment for memory regions at all, whether they're RAM, IO or anything else.". Maybe vfio/vhost should simply skip those odd regions silently. > > > if ((!memory_region_is_ram(section->mr) && > > !memory_region_is_iommu(section->mr)) || > > memory_region_is_protected(section->mr) || > > -- > > 2.38.1 > >
Re: [PATCH for 7.2?] vhost: fix vq dirt bitmap syncing when vIOMMU is enabled
On Wed, Nov 23, 2022 at 03:08:25PM +0800, Jason Wang wrote: > On Wed, Nov 23, 2022 at 2:15 PM Michael S. Tsirkin wrote: > > > > On Wed, Nov 23, 2022 at 01:47:04PM +0800, Jason Wang wrote: > > > On Wed, Nov 23, 2022 at 1:26 PM Jason Wang wrote: > > > > > > > > On Wed, Nov 23, 2022 at 12:28 AM Eric Auger > > > > wrote: > > > > > > > > > > Hi, > > > > > > > > > > On 11/22/22 10:43, Michael S. Tsirkin wrote: > > > > > > On Tue, Nov 22, 2022 at 11:01:11AM +0800, Jason Wang wrote: > > > > > >> When vIOMMU is enabled, the vq->used_phys is actually the IOVA not > > > > > >> GPA. So we need to translate it to GPA before the syncing > > > > > >> otherwise we > > > > > >> may hit the following crash since IOVA could be out of the scope of > > > > > >> the GPA log size. This could be noted when using virtio-IOMMU with > > > > > >> vhost using 1G memory. > > > > > > Noted how exactly? What does "using 1G memory" mean? > > > > > > > > > > We hit the following assertion: > > > > > > > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: > > > > > Assertion `end / VHOST_LOG_CHUNK < dev->log_size' failed. > > > > > > > > > > On the last time vhost_get_log_size() is called it takes into account > > > > > 2 regions when computing the log_size: > > > > > qemu-system-x86_64: vhost_get_log_size region 0 last=0x9 updated > > > > > log_size=0x3 > > > > > qemu-system-x86_64: vhost_get_log_size region 1 last=0x3fff > > > > > updated log_size=0x1000 > > > > > so in vhost_migration_log() vhost_get_log_size(dev) returns 0x1000 > > > > > > > > > > In the test case, memory_region_sync_dirty_bitmap() gets called for > > > > > mem-machine_mem, vga.vram (several times) and eventually on pc.bios. > > > > > This latter is reponsible for the assertion: > > > > > > > > > > qemu-system-x86_64: vhost_log_sync calls sync_dirty_map on pc.bios > > > > > for the full range > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls > > > > > vhost_dev_sync_region on region 0 > > > > > qemu-system-x86_64: vhost_dev_sync_region end=0x9 < > > > > > start=0xfffc > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls > > > > > vhost_dev_sync_region on region 1 > > > > > qemu-system-x86_64: vhost_dev_sync_region end=0x3fff < > > > > > start=0xfffc > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls > > > > > vhost_dev_sync_region on vq 0 <- > > > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios mfirst=0xfffc > > > > > mlast=0x rfirst=0xf240 rlast=0xfa45 > > > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios end=0xfa45 > > > > > VHOST_LOG_CHUNK=0x4 end/VHOST_LOG_CHUNK=0x3fff > > > > > dev->log_size=0x1000 > > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: > > > > > Assertion `end / VHOST_LOG_CHUNK < dev->log_size' failed. > > > > > > > > > > > > > > > > > > > > "using 1G memory": We hit the issue with a guest started with 1GB > > > > > initial RAM. > > > > > > > > Yes, so in the case the guest iova allocator may try to use an IOVA > > > > that is beyond 1G. > > > > > > > > > > > > > > > > > > > > > > > > > >> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > > > > >> Cc: qemu-sta...@nongnu.org > > > > > >> Reported-by: Yalan Zhang > > > > > >> Tested-by: Eric Auger > > > > > >> Tested-by: Lei Yang > > > > > >> Signed-off-by: Jason Wang > > > > > >> --- > > > > > >> hw/virtio/vhost.c | 65 > > > > > >> --- > > > > > >> 1 file changed, 45 insertions(+), 20 deletions(-) > > > > > >> > > > > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > > > > > >> index d1c4c20b8c..26b319f34e 100644 > > > > > >> --- a/hw/virtio/vhost.c > > > > > >> +++ b/hw/virtio/vhost.c > > > > > >> @@ -106,11 +106,30 @@ static void vhost_dev_sync_region(struct > > > > > >> vhost_dev *dev, > > > > > >> } > > > > > >> } > > > > > >> > > > > > >> +static bool vhost_dev_has_iommu(struct vhost_dev *dev) > > > > > >> +{ > > > > > >> +VirtIODevice *vdev = dev->vdev; > > > > > >> + > > > > > >> +/* > > > > > >> + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend > > > > > >> support > > > > > >> + * incremental memory mapping API via IOTLB API. For platform > > > > > >> that > > > > > >> + * does not have IOMMU, there's no need to enable this feature > > > > > >> + * which may cause unnecessary IOTLB miss/update transactions. > > > > > >> + */ > > > > > >> +if (vdev) { > > > > > >> +return virtio_bus_device_iommu_enabled(vdev) && > > > > > >> +virtio_host_has_feature(vdev, > > > > > >> VIRTIO_F_IOMMU_PLATFORM); > > > > > >> +} else { > > > > > >> +return false; > > > > > >> +} > > > > > >> +} > > > > > >> + > > > > > >> static int vhost_sync_dirty_bitmap(struct vhost_dev *dev, > > > > > >> MemoryRegionSection *section, > > > > > >>
Re: [PATCH for 7.2?] vhost: fix vq dirt bitmap syncing when vIOMMU is enabled
On Wed, Nov 23, 2022 at 2:15 PM Michael S. Tsirkin wrote: > > On Wed, Nov 23, 2022 at 01:47:04PM +0800, Jason Wang wrote: > > On Wed, Nov 23, 2022 at 1:26 PM Jason Wang wrote: > > > > > > On Wed, Nov 23, 2022 at 12:28 AM Eric Auger wrote: > > > > > > > > Hi, > > > > > > > > On 11/22/22 10:43, Michael S. Tsirkin wrote: > > > > > On Tue, Nov 22, 2022 at 11:01:11AM +0800, Jason Wang wrote: > > > > >> When vIOMMU is enabled, the vq->used_phys is actually the IOVA not > > > > >> GPA. So we need to translate it to GPA before the syncing otherwise > > > > >> we > > > > >> may hit the following crash since IOVA could be out of the scope of > > > > >> the GPA log size. This could be noted when using virtio-IOMMU with > > > > >> vhost using 1G memory. > > > > > Noted how exactly? What does "using 1G memory" mean? > > > > > > > > We hit the following assertion: > > > > > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: > > > > Assertion `end / VHOST_LOG_CHUNK < dev->log_size' failed. > > > > > > > > On the last time vhost_get_log_size() is called it takes into account 2 > > > > regions when computing the log_size: > > > > qemu-system-x86_64: vhost_get_log_size region 0 last=0x9 updated > > > > log_size=0x3 > > > > qemu-system-x86_64: vhost_get_log_size region 1 last=0x3fff updated > > > > log_size=0x1000 > > > > so in vhost_migration_log() vhost_get_log_size(dev) returns 0x1000 > > > > > > > > In the test case, memory_region_sync_dirty_bitmap() gets called for > > > > mem-machine_mem, vga.vram (several times) and eventually on pc.bios. > > > > This latter is reponsible for the assertion: > > > > > > > > qemu-system-x86_64: vhost_log_sync calls sync_dirty_map on pc.bios for > > > > the full range > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region > > > > on region 0 > > > > qemu-system-x86_64: vhost_dev_sync_region end=0x9 < start=0xfffc > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region > > > > on region 1 > > > > qemu-system-x86_64: vhost_dev_sync_region end=0x3fff < > > > > start=0xfffc > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region > > > > on vq 0 <- > > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios mfirst=0xfffc > > > > mlast=0x rfirst=0xf240 rlast=0xfa45 > > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios end=0xfa45 > > > > VHOST_LOG_CHUNK=0x4 end/VHOST_LOG_CHUNK=0x3fff dev->log_size=0x1000 > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: > > > > Assertion `end / VHOST_LOG_CHUNK < dev->log_size' failed. > > > > > > > > > > > > > > > > "using 1G memory": We hit the issue with a guest started with 1GB > > > > initial RAM. > > > > > > Yes, so in the case the guest iova allocator may try to use an IOVA > > > that is beyond 1G. > > > > > > > > > > > > > > > > > > > > >> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > > > >> Cc: qemu-sta...@nongnu.org > > > > >> Reported-by: Yalan Zhang > > > > >> Tested-by: Eric Auger > > > > >> Tested-by: Lei Yang > > > > >> Signed-off-by: Jason Wang > > > > >> --- > > > > >> hw/virtio/vhost.c | 65 > > > > >> --- > > > > >> 1 file changed, 45 insertions(+), 20 deletions(-) > > > > >> > > > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > > > > >> index d1c4c20b8c..26b319f34e 100644 > > > > >> --- a/hw/virtio/vhost.c > > > > >> +++ b/hw/virtio/vhost.c > > > > >> @@ -106,11 +106,30 @@ static void vhost_dev_sync_region(struct > > > > >> vhost_dev *dev, > > > > >> } > > > > >> } > > > > >> > > > > >> +static bool vhost_dev_has_iommu(struct vhost_dev *dev) > > > > >> +{ > > > > >> +VirtIODevice *vdev = dev->vdev; > > > > >> + > > > > >> +/* > > > > >> + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support > > > > >> + * incremental memory mapping API via IOTLB API. For platform > > > > >> that > > > > >> + * does not have IOMMU, there's no need to enable this feature > > > > >> + * which may cause unnecessary IOTLB miss/update transactions. > > > > >> + */ > > > > >> +if (vdev) { > > > > >> +return virtio_bus_device_iommu_enabled(vdev) && > > > > >> +virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > > > > >> +} else { > > > > >> +return false; > > > > >> +} > > > > >> +} > > > > >> + > > > > >> static int vhost_sync_dirty_bitmap(struct vhost_dev *dev, > > > > >> MemoryRegionSection *section, > > > > >> hwaddr first, > > > > >> hwaddr last) > > > > >> { > > > > >> +IOMMUTLBEntry iotlb; > > > > > why don't we move this inside the scope where it's used? > > > > > > That's fine. > > > > > > > > > > > > >> int i; > > > > >> hwaddr start_addr; > > > > >> hwaddr end_addr; > >
[PATCH 3/2] ppc4xx_sdram: Simplify sdram_ddr_size() to return
Suggested-by: BALATON Zoltan Signed-off-by: Markus Armbruster --- hw/ppc/ppc4xx_sdram.c | 10 +++--- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/hw/ppc/ppc4xx_sdram.c b/hw/ppc/ppc4xx_sdram.c index 54bf9a2b44..a24c80b1d2 100644 --- a/hw/ppc/ppc4xx_sdram.c +++ b/hw/ppc/ppc4xx_sdram.c @@ -192,17 +192,13 @@ static inline hwaddr sdram_ddr_base(uint32_t bcr) static hwaddr sdram_ddr_size(uint32_t bcr) { -hwaddr size; -int sh; +int sh = (bcr >> 17) & 0x7; -sh = (bcr >> 17) & 0x7; if (sh == 7) { -size = -1; -} else { -size = (4 * MiB) << sh; +return -1; } -return size; +return (4 * MiB) << sh; } static uint32_t sdram_ddr_dcr_read(void *opaque, int dcrn) -- 2.37.3
Re: [PATCH for-7.2 v2 3/4] libvhost-user: Fix two more format strings
On Wed, Nov 23, 2022 at 07:49:53AM +0100, Stefan Weil wrote: > Am 23.11.22 um 07:35 schrieb Stefan Weil: > > Am 15.11.22 um 08:25 schrieb Stefan Weil: > > > Am 05.11.22 um 11:24 schrieb Stefan Weil: > > > > > > > This fix is required for 32 bit host. The bug was detected by CI > > > > for arm-linux, but is also relevant for i386-linux. > > > > > > > > Reported-by: Stefan Hajnoczi > > > > Signed-off-by: Stefan Weil > > > > --- > > > > subprojects/libvhost-user/libvhost-user.c | 3 ++- > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/subprojects/libvhost-user/libvhost-user.c > > > > b/subprojects/libvhost-user/libvhost-user.c > > > > index d67953a1c3..80f9952e71 100644 > > > > --- a/subprojects/libvhost-user/libvhost-user.c > > > > +++ b/subprojects/libvhost-user/libvhost-user.c > > > > @@ -651,7 +651,8 @@ generate_faults(VuDev *dev) { > > > > if (ioctl(dev->postcopy_ufd, UFFDIO_REGISTER, _struct)) { > > > > vu_panic(dev, "%s: Failed to userfault region %d " > > > > - "@%" PRIx64 " + size:%zx offset: %zx: > > > > (ufd=%d)%s\n", > > > > + "@%" PRIx64 " + size:%" PRIx64 " > > > > offset: %" PRIx64 > > > > + ": (ufd=%d)%s\n", > > > > __func__, i, > > > > dev_region->mmap_addr, > > > > dev_region->size, dev_region->mmap_offset, > > > > > > > > > There is still no review for this patch. > > > > > > I added "for-7.2" to the subject here in my answer. How can we get > > > all 4 patches of this series into the new release? > > > > > > Stefan > > > > Ping? Those bug fixes are still missing in -rc2. > > > > Stefan > > Hello Michael, > > I just noticed that MAINTAINERS has no entry for the files in > subprojects/libvhost-user, so I did not cc you in my previous e-mails. > Should that directory be added to the "vhost" section"? > > Stefan pls do
Re: [PATCH for-7.2 v2 3/4] libvhost-user: Fix two more format strings
Am 23.11.22 um 07:35 schrieb Stefan Weil: Am 15.11.22 um 08:25 schrieb Stefan Weil: Am 05.11.22 um 11:24 schrieb Stefan Weil: This fix is required for 32 bit host. The bug was detected by CI for arm-linux, but is also relevant for i386-linux. Reported-by: Stefan Hajnoczi Signed-off-by: Stefan Weil --- subprojects/libvhost-user/libvhost-user.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c index d67953a1c3..80f9952e71 100644 --- a/subprojects/libvhost-user/libvhost-user.c +++ b/subprojects/libvhost-user/libvhost-user.c @@ -651,7 +651,8 @@ generate_faults(VuDev *dev) { if (ioctl(dev->postcopy_ufd, UFFDIO_REGISTER, _struct)) { vu_panic(dev, "%s: Failed to userfault region %d " - "@%" PRIx64 " + size:%zx offset: %zx: (ufd=%d)%s\n", + "@%" PRIx64 " + size:%" PRIx64 " offset: %" PRIx64 + ": (ufd=%d)%s\n", __func__, i, dev_region->mmap_addr, dev_region->size, dev_region->mmap_offset, There is still no review for this patch. I added "for-7.2" to the subject here in my answer. How can we get all 4 patches of this series into the new release? Stefan Ping? Those bug fixes are still missing in -rc2. Stefan Hello Michael, I just noticed that MAINTAINERS has no entry for the files in subprojects/libvhost-user, so I did not cc you in my previous e-mails. Should that directory be added to the "vhost" section"? Stefan
Re: [PATCH v5 1/2] sysemu: tpm: Add a stub function for TPM_IS_CRB
On Fri, May 06, 2022 at 09:47:52AM -0400, Stefan Berger wrote: > > > On 5/6/22 09:25, Eric Auger wrote: > > In a subsequent patch, VFIO will need to recognize if > > a memory region owner is a TPM CRB device. Hence VFIO > > needs to use TPM_IS_CRB() even if CONFIG_TPM is unset. So > > let's add a stub function. > > > > Signed-off-by: Eric Auger > > Suggested-by: Cornelia Huck > Reviewed-by: Stefan Berger ... and now in 7.2 vdpa needs a dependency on tpm too, what a hack :( And what exactly is it about TPM CRB that everyone needs to know about it and skip it? The API does not tell ... > > --- > > include/sysemu/tpm.h | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/include/sysemu/tpm.h b/include/sysemu/tpm.h > > index 68b2206463c..fb40e30ff60 100644 > > --- a/include/sysemu/tpm.h > > +++ b/include/sysemu/tpm.h > > @@ -80,6 +80,12 @@ static inline TPMVersion tpm_get_version(TPMIf *ti) > > #define tpm_init() (0) > > #define tpm_cleanup() > > > > +/* needed for an alignment check in non-tpm code */ > > +static inline Object *TPM_IS_CRB(Object *obj) > > +{ > > + return NULL; > > +} > > + > > #endif /* CONFIG_TPM */ > > > > #endif /* QEMU_TPM_H */ > >
Re: [PATCH for-7.2 v2 3/4] libvhost-user: Fix two more format strings
Am 15.11.22 um 08:25 schrieb Stefan Weil: Am 05.11.22 um 11:24 schrieb Stefan Weil: This fix is required for 32 bit host. The bug was detected by CI for arm-linux, but is also relevant for i386-linux. Reported-by: Stefan Hajnoczi Signed-off-by: Stefan Weil --- subprojects/libvhost-user/libvhost-user.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c index d67953a1c3..80f9952e71 100644 --- a/subprojects/libvhost-user/libvhost-user.c +++ b/subprojects/libvhost-user/libvhost-user.c @@ -651,7 +651,8 @@ generate_faults(VuDev *dev) { if (ioctl(dev->postcopy_ufd, UFFDIO_REGISTER, _struct)) { vu_panic(dev, "%s: Failed to userfault region %d " - "@%" PRIx64 " + size:%zx offset: %zx: (ufd=%d)%s\n", + "@%" PRIx64 " + size:%" PRIx64 " offset: %" PRIx64 + ": (ufd=%d)%s\n", __func__, i, dev_region->mmap_addr, dev_region->size, dev_region->mmap_offset, There is still no review for this patch. I added "for-7.2" to the subject here in my answer. How can we get all 4 patches of this series into the new release? Stefan Ping? Those bug fixes are still missing in -rc2. Stefan OpenPGP_0xE08C21D5677450AD.asc Description: OpenPGP public key OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH v5 0/2] vfio/common: remove spurious tpm-crb-cmd misalignment warning
On Fri, May 06, 2022 at 03:25:08PM +0200, Eric Auger wrote: > The CRB command buffer currently is a RAM MemoryRegion and given > its base address alignment, it causes an error report on > vfio_listener_region_add(). This region could have been a RAM device > region, easing the detection of such safe situation but this option > was not well received. Eric could you point me at this discussion please? We are now asked to proliferate stuff like this into vdpa as well, this just doesn't scale. I'd like to see whether we can make it a RAM device region after all - was a patch like that posted? > So let's add a helper function that uses the > memory region owner type to detect the situation is safe wrt > the assignment. Other device types can be checked here if such kind > of problem occurs again. > > As TPM devices can be compiled out we need to introduce a stub > for TPM_IS_CRB. > > Best Regards > > Eric > > This series can be found at: > https://github.com/eauger/qemu/tree/tpm-crb-vfio-v5 > > History: > > v4 -> v5: > - Add sysemu: tpm: Add a stub function for TPM_IS_CRB to fix > compilation error if CONFIG_TPM is unset > > Eric Auger (2): > sysemu: tpm: Add a stub function for TPM_IS_CRB > vfio/common: remove spurious tpm-crb-cmd misalignment warning > > hw/vfio/common.c | 27 ++- > hw/vfio/trace-events | 1 + > include/sysemu/tpm.h | 6 ++ > 3 files changed, 33 insertions(+), 1 deletion(-) > > -- > 2.35.1 > > >
Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start
Am 21.11.22 um 23:37 schrieb Michael S. Tsirkin: [...] qemu-system-x86_64: ../hw/virtio/vhost-vsock-common.c:203: vhost_vsock_common_pre_save: Assertion `!vhost_dev_is_started(>vhost_dev)' failed. 2022-11-15 16:38:46.096+: shutting down, reason=crashed Alex were you able to replicate? Just curious. Ping?
Re: [PATCH for 7.2?] vhost: fix vq dirt bitmap syncing when vIOMMU is enabled
On Wed, Nov 23, 2022 at 01:47:04PM +0800, Jason Wang wrote: > On Wed, Nov 23, 2022 at 1:26 PM Jason Wang wrote: > > > > On Wed, Nov 23, 2022 at 12:28 AM Eric Auger wrote: > > > > > > Hi, > > > > > > On 11/22/22 10:43, Michael S. Tsirkin wrote: > > > > On Tue, Nov 22, 2022 at 11:01:11AM +0800, Jason Wang wrote: > > > >> When vIOMMU is enabled, the vq->used_phys is actually the IOVA not > > > >> GPA. So we need to translate it to GPA before the syncing otherwise we > > > >> may hit the following crash since IOVA could be out of the scope of > > > >> the GPA log size. This could be noted when using virtio-IOMMU with > > > >> vhost using 1G memory. > > > > Noted how exactly? What does "using 1G memory" mean? > > > > > > We hit the following assertion: > > > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: > > > Assertion `end / VHOST_LOG_CHUNK < dev->log_size' failed. > > > > > > On the last time vhost_get_log_size() is called it takes into account 2 > > > regions when computing the log_size: > > > qemu-system-x86_64: vhost_get_log_size region 0 last=0x9 updated > > > log_size=0x3 > > > qemu-system-x86_64: vhost_get_log_size region 1 last=0x3fff updated > > > log_size=0x1000 > > > so in vhost_migration_log() vhost_get_log_size(dev) returns 0x1000 > > > > > > In the test case, memory_region_sync_dirty_bitmap() gets called for > > > mem-machine_mem, vga.vram (several times) and eventually on pc.bios. This > > > latter is reponsible for the assertion: > > > > > > qemu-system-x86_64: vhost_log_sync calls sync_dirty_map on pc.bios for > > > the full range > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region > > > on region 0 > > > qemu-system-x86_64: vhost_dev_sync_region end=0x9 < start=0xfffc > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region > > > on region 1 > > > qemu-system-x86_64: vhost_dev_sync_region end=0x3fff < > > > start=0xfffc > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region > > > on vq 0 <- > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios mfirst=0xfffc > > > mlast=0x rfirst=0xf240 rlast=0xfa45 > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios end=0xfa45 > > > VHOST_LOG_CHUNK=0x4 end/VHOST_LOG_CHUNK=0x3fff dev->log_size=0x1000 > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: > > > Assertion `end / VHOST_LOG_CHUNK < dev->log_size' failed. > > > > > > > > > > > > "using 1G memory": We hit the issue with a guest started with 1GB initial > > > RAM. > > > > Yes, so in the case the guest iova allocator may try to use an IOVA > > that is beyond 1G. > > > > > > > > > > > > > > > >> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > > >> Cc: qemu-sta...@nongnu.org > > > >> Reported-by: Yalan Zhang > > > >> Tested-by: Eric Auger > > > >> Tested-by: Lei Yang > > > >> Signed-off-by: Jason Wang > > > >> --- > > > >> hw/virtio/vhost.c | 65 --- > > > >> 1 file changed, 45 insertions(+), 20 deletions(-) > > > >> > > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > > > >> index d1c4c20b8c..26b319f34e 100644 > > > >> --- a/hw/virtio/vhost.c > > > >> +++ b/hw/virtio/vhost.c > > > >> @@ -106,11 +106,30 @@ static void vhost_dev_sync_region(struct > > > >> vhost_dev *dev, > > > >> } > > > >> } > > > >> > > > >> +static bool vhost_dev_has_iommu(struct vhost_dev *dev) > > > >> +{ > > > >> +VirtIODevice *vdev = dev->vdev; > > > >> + > > > >> +/* > > > >> + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support > > > >> + * incremental memory mapping API via IOTLB API. For platform that > > > >> + * does not have IOMMU, there's no need to enable this feature > > > >> + * which may cause unnecessary IOTLB miss/update transactions. > > > >> + */ > > > >> +if (vdev) { > > > >> +return virtio_bus_device_iommu_enabled(vdev) && > > > >> +virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > > > >> +} else { > > > >> +return false; > > > >> +} > > > >> +} > > > >> + > > > >> static int vhost_sync_dirty_bitmap(struct vhost_dev *dev, > > > >> MemoryRegionSection *section, > > > >> hwaddr first, > > > >> hwaddr last) > > > >> { > > > >> +IOMMUTLBEntry iotlb; > > > > why don't we move this inside the scope where it's used? > > > > That's fine. > > > > > > > > > >> int i; > > > >> hwaddr start_addr; > > > >> hwaddr end_addr; > > > >> @@ -132,13 +151,37 @@ static int vhost_sync_dirty_bitmap(struct > > > >> vhost_dev *dev, > > > >> } > > > >> for (i = 0; i < dev->nvqs; ++i) { > > > >> struct vhost_virtqueue *vq = dev->vqs + i; > > > >> +hwaddr used_phys = vq->used_phys, used_size = vq->used_size; > > > >> +
Re: [PATCH for 7.2?] vhost: fix vq dirt bitmap syncing when vIOMMU is enabled
On Wed, Nov 23, 2022 at 1:26 PM Jason Wang wrote: > > On Wed, Nov 23, 2022 at 12:28 AM Eric Auger wrote: > > > > Hi, > > > > On 11/22/22 10:43, Michael S. Tsirkin wrote: > > > On Tue, Nov 22, 2022 at 11:01:11AM +0800, Jason Wang wrote: > > >> When vIOMMU is enabled, the vq->used_phys is actually the IOVA not > > >> GPA. So we need to translate it to GPA before the syncing otherwise we > > >> may hit the following crash since IOVA could be out of the scope of > > >> the GPA log size. This could be noted when using virtio-IOMMU with > > >> vhost using 1G memory. > > > Noted how exactly? What does "using 1G memory" mean? > > > > We hit the following assertion: > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: > > Assertion `end / VHOST_LOG_CHUNK < dev->log_size' failed. > > > > On the last time vhost_get_log_size() is called it takes into account 2 > > regions when computing the log_size: > > qemu-system-x86_64: vhost_get_log_size region 0 last=0x9 updated > > log_size=0x3 > > qemu-system-x86_64: vhost_get_log_size region 1 last=0x3fff updated > > log_size=0x1000 > > so in vhost_migration_log() vhost_get_log_size(dev) returns 0x1000 > > > > In the test case, memory_region_sync_dirty_bitmap() gets called for > > mem-machine_mem, vga.vram (several times) and eventually on pc.bios. This > > latter is reponsible for the assertion: > > > > qemu-system-x86_64: vhost_log_sync calls sync_dirty_map on pc.bios for the > > full range > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region on > > region 0 > > qemu-system-x86_64: vhost_dev_sync_region end=0x9 < start=0xfffc > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region on > > region 1 > > qemu-system-x86_64: vhost_dev_sync_region end=0x3fff < start=0xfffc > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region on > > vq 0 <- > > qemu-system-x86_64: vhost_dev_sync_region pc.bios mfirst=0xfffc > > mlast=0x rfirst=0xf240 rlast=0xfa45 > > qemu-system-x86_64: vhost_dev_sync_region pc.bios end=0xfa45 > > VHOST_LOG_CHUNK=0x4 end/VHOST_LOG_CHUNK=0x3fff dev->log_size=0x1000 > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: > > Assertion `end / VHOST_LOG_CHUNK < dev->log_size' failed. > > > > > > > > "using 1G memory": We hit the issue with a guest started with 1GB initial > > RAM. > > Yes, so in the case the guest iova allocator may try to use an IOVA > that is beyond 1G. > > > > > > > > > > >> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > >> Cc: qemu-sta...@nongnu.org > > >> Reported-by: Yalan Zhang > > >> Tested-by: Eric Auger > > >> Tested-by: Lei Yang > > >> Signed-off-by: Jason Wang > > >> --- > > >> hw/virtio/vhost.c | 65 --- > > >> 1 file changed, 45 insertions(+), 20 deletions(-) > > >> > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > > >> index d1c4c20b8c..26b319f34e 100644 > > >> --- a/hw/virtio/vhost.c > > >> +++ b/hw/virtio/vhost.c > > >> @@ -106,11 +106,30 @@ static void vhost_dev_sync_region(struct vhost_dev > > >> *dev, > > >> } > > >> } > > >> > > >> +static bool vhost_dev_has_iommu(struct vhost_dev *dev) > > >> +{ > > >> +VirtIODevice *vdev = dev->vdev; > > >> + > > >> +/* > > >> + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support > > >> + * incremental memory mapping API via IOTLB API. For platform that > > >> + * does not have IOMMU, there's no need to enable this feature > > >> + * which may cause unnecessary IOTLB miss/update transactions. > > >> + */ > > >> +if (vdev) { > > >> +return virtio_bus_device_iommu_enabled(vdev) && > > >> +virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > > >> +} else { > > >> +return false; > > >> +} > > >> +} > > >> + > > >> static int vhost_sync_dirty_bitmap(struct vhost_dev *dev, > > >> MemoryRegionSection *section, > > >> hwaddr first, > > >> hwaddr last) > > >> { > > >> +IOMMUTLBEntry iotlb; > > > why don't we move this inside the scope where it's used? > > That's fine. > > > > > > >> int i; > > >> hwaddr start_addr; > > >> hwaddr end_addr; > > >> @@ -132,13 +151,37 @@ static int vhost_sync_dirty_bitmap(struct > > >> vhost_dev *dev, > > >> } > > >> for (i = 0; i < dev->nvqs; ++i) { > > >> struct vhost_virtqueue *vq = dev->vqs + i; > > >> +hwaddr used_phys = vq->used_phys, used_size = vq->used_size; > > >> +hwaddr phys, s; > > > these two, too. > > Right. > > > > > > >> > > >> if (!vq->used_phys && !vq->used_size) { > > >> continue; > > >> } > > >> > > >> -vhost_dev_sync_region(dev, section, start_addr, end_addr, > > >> vq->used_phys, > > >> -
Re: [PATCH for 7.2?] vhost: fix vq dirt bitmap syncing when vIOMMU is enabled
On Wed, Nov 23, 2022 at 12:28 AM Eric Auger wrote: > > Hi, > > On 11/22/22 10:43, Michael S. Tsirkin wrote: > > On Tue, Nov 22, 2022 at 11:01:11AM +0800, Jason Wang wrote: > >> When vIOMMU is enabled, the vq->used_phys is actually the IOVA not > >> GPA. So we need to translate it to GPA before the syncing otherwise we > >> may hit the following crash since IOVA could be out of the scope of > >> the GPA log size. This could be noted when using virtio-IOMMU with > >> vhost using 1G memory. > > Noted how exactly? What does "using 1G memory" mean? > > We hit the following assertion: > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: Assertion > `end / VHOST_LOG_CHUNK < dev->log_size' failed. > > On the last time vhost_get_log_size() is called it takes into account 2 > regions when computing the log_size: > qemu-system-x86_64: vhost_get_log_size region 0 last=0x9 updated > log_size=0x3 > qemu-system-x86_64: vhost_get_log_size region 1 last=0x3fff updated > log_size=0x1000 > so in vhost_migration_log() vhost_get_log_size(dev) returns 0x1000 > > In the test case, memory_region_sync_dirty_bitmap() gets called for > mem-machine_mem, vga.vram (several times) and eventually on pc.bios. This > latter is reponsible for the assertion: > > qemu-system-x86_64: vhost_log_sync calls sync_dirty_map on pc.bios for the > full range > qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region on > region 0 > qemu-system-x86_64: vhost_dev_sync_region end=0x9 < start=0xfffc > qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region on > region 1 > qemu-system-x86_64: vhost_dev_sync_region end=0x3fff < start=0xfffc > qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region on vq > 0 <- > qemu-system-x86_64: vhost_dev_sync_region pc.bios mfirst=0xfffc > mlast=0x rfirst=0xf240 rlast=0xfa45 > qemu-system-x86_64: vhost_dev_sync_region pc.bios end=0xfa45 > VHOST_LOG_CHUNK=0x4 end/VHOST_LOG_CHUNK=0x3fff dev->log_size=0x1000 > qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: Assertion > `end / VHOST_LOG_CHUNK < dev->log_size' failed. > > > > "using 1G memory": We hit the issue with a guest started with 1GB initial RAM. Yes, so in the case the guest iova allocator may try to use an IOVA that is beyond 1G. > > > > > >> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > >> Cc: qemu-sta...@nongnu.org > >> Reported-by: Yalan Zhang > >> Tested-by: Eric Auger > >> Tested-by: Lei Yang > >> Signed-off-by: Jason Wang > >> --- > >> hw/virtio/vhost.c | 65 --- > >> 1 file changed, 45 insertions(+), 20 deletions(-) > >> > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > >> index d1c4c20b8c..26b319f34e 100644 > >> --- a/hw/virtio/vhost.c > >> +++ b/hw/virtio/vhost.c > >> @@ -106,11 +106,30 @@ static void vhost_dev_sync_region(struct vhost_dev > >> *dev, > >> } > >> } > >> > >> +static bool vhost_dev_has_iommu(struct vhost_dev *dev) > >> +{ > >> +VirtIODevice *vdev = dev->vdev; > >> + > >> +/* > >> + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support > >> + * incremental memory mapping API via IOTLB API. For platform that > >> + * does not have IOMMU, there's no need to enable this feature > >> + * which may cause unnecessary IOTLB miss/update transactions. > >> + */ > >> +if (vdev) { > >> +return virtio_bus_device_iommu_enabled(vdev) && > >> +virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > >> +} else { > >> +return false; > >> +} > >> +} > >> + > >> static int vhost_sync_dirty_bitmap(struct vhost_dev *dev, > >> MemoryRegionSection *section, > >> hwaddr first, > >> hwaddr last) > >> { > >> +IOMMUTLBEntry iotlb; > > why don't we move this inside the scope where it's used? That's fine. > > > >> int i; > >> hwaddr start_addr; > >> hwaddr end_addr; > >> @@ -132,13 +151,37 @@ static int vhost_sync_dirty_bitmap(struct vhost_dev > >> *dev, > >> } > >> for (i = 0; i < dev->nvqs; ++i) { > >> struct vhost_virtqueue *vq = dev->vqs + i; > >> +hwaddr used_phys = vq->used_phys, used_size = vq->used_size; > >> +hwaddr phys, s; > > these two, too. Right. > > > >> > >> if (!vq->used_phys && !vq->used_size) { > >> continue; > >> } > >> > >> -vhost_dev_sync_region(dev, section, start_addr, end_addr, > >> vq->used_phys, > >> - range_get_last(vq->used_phys, > >> vq->used_size)); > >> +if (vhost_dev_has_iommu(dev)) { > >> +while (used_size) { > >> +rcu_read_lock(); > >> +iotlb = address_space_get_iotlb_entry(dev->vdev->dma_as, > >> +
Re: [PATCH] vhost-vdpa: skip TPM CRB memory section
On Wed, Nov 23, 2022 at 12:50 AM Eugenio Perez Martin wrote: > > On Tue, Nov 22, 2022 at 3:53 PM wrote: > > > > From: Marc-André Lureau > > > > 851d6d1a0f ("vfio/common: remove spurious tpm-crb-cmd misalignment > > warning") removed the warning on vfio_listener_region_add() path. > > > > An error is reported for vhost-vdpa case: > > qemu-kvm: vhost_vdpa_listener_region_add received unaligned region > > > > Skip the CRB device. > > > > Fixes: > > https://bugzilla.redhat.com/show_bug.cgi?id=2141965 > > > > Signed-off-by: Marc-André Lureau > > Acked-by: Eugenio Pérez > > And CCing Jason too. Adding Cindy. Acked-by: Jason Wang Thanks > > > --- > > hw/virtio/vhost-vdpa.c | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > > index 7468e44b87..9d7206e4b8 100644 > > --- a/hw/virtio/vhost-vdpa.c > > +++ b/hw/virtio/vhost-vdpa.c > > @@ -19,6 +19,7 @@ > > #include "hw/virtio/virtio-net.h" > > #include "hw/virtio/vhost-shadow-virtqueue.h" > > #include "hw/virtio/vhost-vdpa.h" > > +#include "sysemu/tpm.h" > > #include "exec/address-spaces.h" > > #include "migration/blocker.h" > > #include "qemu/cutils.h" > > @@ -46,6 +47,11 @@ static bool > > vhost_vdpa_listener_skipped_section(MemoryRegionSection *section, > > { > > Int128 llend; > > > > +if (TPM_IS_CRB(section->mr->owner)) { > > +/* The CRB command buffer has its base address unaligned. */ > > +return true; > > +} > > + > > if ((!memory_region_is_ram(section->mr) && > > !memory_region_is_iommu(section->mr)) || > > memory_region_is_protected(section->mr) || > > -- > > 2.38.1 > > >
Re: [PATCH] target/riscv: Dump sstatus CSR in riscv_cpu_dump_state()
Hi Alistair, On Wed, Nov 23, 2022 at 8:03 AM Alistair Francis wrote: > > On Wed, Nov 23, 2022 at 2:07 AM Bin Meng wrote: > > > > sstatus register dump is currently missing in riscv_cpu_dump_state(). > > > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1332 > > Signed-off-by: Bin Meng > > > > --- > > > > target/riscv/cpu.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > > index d14e95c9dc..80d76f0181 100644 > > --- a/target/riscv/cpu.c > > +++ b/target/riscv/cpu.c > > @@ -382,6 +382,7 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, > > int flags) > > CSR_MHARTID, > > CSR_MSTATUS, > > CSR_MSTATUSH, > > +CSR_SSTATUS, > > I don't think we need this. mstatus contains all of the information > already and there is limited space to print all of this information > out. > I am not sure what limited space restricts this? This is CPU state dump, and printing sstatus CSR seems reasonable to me. We do the similar thing in the gdb stub too. Regards, Bin
[PATCH] hw/usb: add configuration flags for emulated and passthru usb smartcard
We add two new configuration flags, USB_SMARTCARD_PASSTHRU and USB_SMARTCARD_EMULATED in order to improve configurability of these two functionalities. Signed-off-by: Jon Maloy --- hw/usb/Kconfig | 12 hw/usb/meson.build | 4 ++-- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/hw/usb/Kconfig b/hw/usb/Kconfig index ce4f433976..50a82badd6 100644 --- a/hw/usb/Kconfig +++ b/hw/usb/Kconfig @@ -108,6 +108,18 @@ config USB_SMARTCARD bool default y depends on USB +select USB_SMARTCARD_PASSTHRU +select USB_SMARTCARD_EMULATED + +config USB_SMARTCARD_PASSTHRU +bool +default y +depends on USB + +config USB_SMARTCARD_EMULATED +bool +default y +depends on USB config USB_STORAGE_MTP bool diff --git a/hw/usb/meson.build b/hw/usb/meson.build index 793df42e21..353006fb6c 100644 --- a/hw/usb/meson.build +++ b/hw/usb/meson.build @@ -51,8 +51,8 @@ softmmu_ss.add(when: 'CONFIG_USB_SMARTCARD', if_true: files('dev-smartcard-reade if cacard.found() usbsmartcard_ss = ss.source_set() - usbsmartcard_ss.add(when: 'CONFIG_USB_SMARTCARD', - if_true: [cacard, files('ccid-card-emulated.c', 'ccid-card-passthru.c')]) + usbsmartcard_ss.add(when: 'CONFIG_USB_SMARTCARD_EMULATED', if_true: [cacard, files('ccid-card-emulated.c')]) + usbsmartcard_ss.add(when: 'CONFIG_USB_SMARTCARD_PASSTHRU', if_true: [cacard, files('ccid-card-passthru.c')]) hw_usb_modules += {'smartcard': usbsmartcard_ss} endif -- 2.35.3
Re: [PATCH v2 1/7] qemu/main-loop: Introduce QEMU_IOTHREAD_LOCK_GUARD, WITH_QEMU_IOTHREAD_LOCK
On 11/22/22 12:57, Richard Henderson wrote: +#define QEMU_IOTHREAD_LOCK_GUARD() \ +g_auto(IOThreadLockAuto) _iothread_lock_auto\ += qemu_iothread_auto_lock(__FILE__, __LINE__) \ Need an extra G_GNUC_UNUSED for (some versions of?) clang. Which is really a bug, because it's always used via the cleanup. r~
[PATCH v2] hw/loongarch: Add cfi01 pflash device
Add cfi01 pflash device for LoongArch virt machine Signed-off-by: Xiaojuan Yang --- hw/loongarch/Kconfig| 1 + hw/loongarch/acpi-build.c | 18 + hw/loongarch/virt.c | 80 - include/hw/loongarch/virt.h | 5 +++ 4 files changed, 103 insertions(+), 1 deletion(-) diff --git a/hw/loongarch/Kconfig b/hw/loongarch/Kconfig index 17d15b6c90..eb112af990 100644 --- a/hw/loongarch/Kconfig +++ b/hw/loongarch/Kconfig @@ -20,3 +20,4 @@ config LOONGARCH_VIRT select ACPI_HW_REDUCED select FW_CFG_DMA select DIMM +select PFLASH_CFI01 diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c index 7d5f5a757d..bc5f5fe9dc 100644 --- a/hw/loongarch/acpi-build.c +++ b/hw/loongarch/acpi-build.c @@ -279,6 +279,23 @@ static void build_pci_device_aml(Aml *scope, LoongArchMachineState *lams) acpi_dsdt_add_gpex(scope, ); } +static void build_flash_aml(Aml *scope, LoongArchMachineState *lams) +{ +Aml *dev, *crs; + +hwaddr flash0_base = VIRT_FLASH0_BASE; +hwaddr flash0_size = VIRT_FLASH0_SIZE; + +dev = aml_device("FLS0"); +aml_append(dev, aml_name_decl("_HID", aml_string("LNRO0015"))); +aml_append(dev, aml_name_decl("_UID", aml_int(0))); + +crs = aml_resource_template(); +aml_append(crs, aml_memory32_fixed(flash0_base, flash0_size, AML_READ_WRITE)); +aml_append(dev, aml_name_decl("_CRS", crs)); +aml_append(scope, dev); +} + #ifdef CONFIG_TPM static void acpi_dsdt_add_tpm(Aml *scope, LoongArchMachineState *vms) { @@ -328,6 +345,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, MachineState *machine) build_uart_device_aml(dsdt); build_pci_device_aml(dsdt, lams); build_la_ged_aml(dsdt, machine); +build_flash_aml(dsdt, lams); #ifdef CONFIG_TPM acpi_dsdt_add_tpm(dsdt, lams); #endif diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c index 958be74fa1..ded77a2f3e 100644 --- a/hw/loongarch/virt.c +++ b/hw/loongarch/virt.c @@ -42,6 +42,74 @@ #include "hw/display/ramfb.h" #include "hw/mem/pc-dimm.h" #include "sysemu/tpm.h" +#include "sysemu/block-backend.h" +#include "hw/block/flash.h" + +static PFlashCFI01 *virt_flash_create1(LoongArchMachineState *lams, + const char *name, + const char *alias_prop_name) +{ +DeviceState *dev = qdev_new(TYPE_PFLASH_CFI01); + +qdev_prop_set_uint64(dev, "sector-length", VIRT_FLASH_SECTOR_SIZE); +qdev_prop_set_uint8(dev, "width", 4); +qdev_prop_set_uint8(dev, "device-width", 2); +qdev_prop_set_bit(dev, "big-endian", false); +qdev_prop_set_uint16(dev, "id0", 0x89); +qdev_prop_set_uint16(dev, "id1", 0x18); +qdev_prop_set_uint16(dev, "id2", 0x00); +qdev_prop_set_uint16(dev, "id3", 0x00); +qdev_prop_set_string(dev, "name", name); +object_property_add_child(OBJECT(lams), name, OBJECT(dev)); +object_property_add_alias(OBJECT(lams), alias_prop_name, + OBJECT(dev), "drive"); +return PFLASH_CFI01(dev); +} + +static void virt_flash_create(LoongArchMachineState *lams) +{ +lams->flash[0] = virt_flash_create1(lams, "virt.flash0", "pflash0"); +} + +static void virt_flash_map1(PFlashCFI01 *flash, +hwaddr base, hwaddr size, +MemoryRegion *sysmem) +{ +DeviceState *dev = DEVICE(flash); + +assert(QEMU_IS_ALIGNED(size, VIRT_FLASH_SECTOR_SIZE)); +assert(size / VIRT_FLASH_SECTOR_SIZE <= UINT32_MAX); + +qdev_prop_set_uint32(dev, "num-blocks", size / VIRT_FLASH_SECTOR_SIZE); +sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), _fatal); +memory_region_add_subregion(sysmem, base, +sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0)); +} + +static void virt_flash_map(LoongArchMachineState *lams, + MemoryRegion *sysmem) +{ +PFlashCFI01 *flash0 = lams->flash[0]; + +virt_flash_map1(flash0, VIRT_FLASH0_BASE, VIRT_FLASH0_SIZE, sysmem); +} + +static void fdt_add_flash_node(LoongArchMachineState *lams) +{ +MachineState *ms = MACHINE(lams); +char *nodename; + +hwaddr flash0_base = VIRT_FLASH0_BASE; +hwaddr flash0_size = VIRT_FLASH0_SIZE; + +nodename = g_strdup_printf("/flash@%" PRIx64, flash0_base); +qemu_fdt_add_subnode(ms->fdt, nodename); +qemu_fdt_setprop_string(ms->fdt, nodename, "compatible", "cfi-flash"); +qemu_fdt_setprop_sized_cells(ms->fdt, nodename, "reg", + 2, flash0_base, 2, flash0_size); +qemu_fdt_setprop_cell(ms->fdt, nodename, "bank-width", 4); +g_free(nodename); +} static void fdt_add_rtc_node(LoongArchMachineState *lams) { @@ -593,9 +661,17 @@ static void loongarch_firmware_init(LoongArchMachineState *lams) { char *filename = MACHINE(lams)->firmware; char *bios_name = NULL; -int bios_size; +int bios_size, i; lams->bios_loaded = false; +/* Map legacy
[PATCH v2] vfio/pci: Verify each MSI vector to avoid invalid MSI vectors
From: Xiang Chen Currently the number of MSI vectors comes from register PCI_MSI_FLAGS which should be power-of-2 in qemu, in some scenaries it is not the same as the number that driver requires in guest, for example, a PCI driver wants to allocate 6 MSI vecotrs in guest, but as the limitation, it will allocate 8 MSI vectors. So it requires 8 MSI vectors in qemu while the driver in guest only wants to allocate 6 MSI vectors. When GICv4.1 is enabled, it iterates over all possible MSIs and enable the forwarding while the guest has only created some of mappings in the virtual ITS, so some calls fail. The exception print is as following: vfio-pci :3a:00.1: irq bypass producer (token 8f08224d) registration fails:66311 To avoid the issue, verify each MSI vector, skip some operations such as request_irq() and irq_bypass_register_producer() for those invalid MSI vectors. Signed-off-by: Xiang Chen --- I reported the issue at the link: https://lkml.kernel.org/lkml/87cze9lcut.wl-...@kernel.org/T/ Change Log: v1 -> v2: Verify each MSI vector in kernel instead of adding systemcall according to Mar's suggestion --- arch/arm64/kvm/vgic/vgic-irqfd.c | 13 + arch/arm64/kvm/vgic/vgic-its.c| 36 arch/arm64/kvm/vgic/vgic.h| 1 + drivers/vfio/pci/vfio_pci_intrs.c | 33 + include/linux/kvm_host.h | 2 ++ 5 files changed, 85 insertions(+) diff --git a/arch/arm64/kvm/vgic/vgic-irqfd.c b/arch/arm64/kvm/vgic/vgic-irqfd.c index 475059b..71f6af57 100644 --- a/arch/arm64/kvm/vgic/vgic-irqfd.c +++ b/arch/arm64/kvm/vgic/vgic-irqfd.c @@ -98,6 +98,19 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e, return vgic_its_inject_msi(kvm, ); } +int kvm_verify_msi(struct kvm *kvm, + struct kvm_kernel_irq_routing_entry *irq_entry) +{ + struct kvm_msi msi; + + if (!vgic_has_its(kvm)) + return -ENODEV; + + kvm_populate_msi(irq_entry, ); + + return vgic_its_verify_msi(kvm, ); +} + /** * kvm_arch_set_irq_inatomic: fast-path for irqfd injection */ diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c index 94a666d..8312a4a 100644 --- a/arch/arm64/kvm/vgic/vgic-its.c +++ b/arch/arm64/kvm/vgic/vgic-its.c @@ -767,6 +767,42 @@ int vgic_its_inject_cached_translation(struct kvm *kvm, struct kvm_msi *msi) return 0; } +int vgic_its_verify_msi(struct kvm *kvm, struct kvm_msi *msi) +{ + struct vgic_its *its; + struct its_ite *ite; + struct kvm_vcpu *vcpu; + int ret = 0; + + if (!irqchip_in_kernel(kvm) || (msi->flags & ~KVM_MSI_VALID_DEVID)) + return -EINVAL; + + if (!vgic_has_its(kvm)) + return -ENODEV; + + its = vgic_msi_to_its(kvm, msi); + if (IS_ERR(its)) + return PTR_ERR(its); + + mutex_lock(>its_lock); + if (!its->enabled) { + ret = -EBUSY; + goto unlock; + } + ite = find_ite(its, msi->devid, msi->data); + if (!ite || !its_is_collection_mapped(ite->collection)) { + ret = E_ITS_INT_UNMAPPED_INTERRUPT; + goto unlock; + } + + vcpu = kvm_get_vcpu(kvm, ite->collection->target_addr); + if (!vcpu) + ret = E_ITS_INT_UNMAPPED_INTERRUPT; +unlock: + mutex_unlock(>its_lock); + return ret; +} + /* * Queries the KVM IO bus framework to get the ITS pointer from the given * doorbell address. diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h index 0c8da72..d452150 100644 --- a/arch/arm64/kvm/vgic/vgic.h +++ b/arch/arm64/kvm/vgic/vgic.h @@ -240,6 +240,7 @@ int kvm_vgic_register_its_device(void); void vgic_enable_lpis(struct kvm_vcpu *vcpu); void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu); int vgic_its_inject_msi(struct kvm *kvm, struct kvm_msi *msi); +int vgic_its_verify_msi(struct kvm *kvm, struct kvm_msi *msi); int vgic_v3_has_attr_regs(struct kvm_device *dev, struct kvm_device_attr *attr); int vgic_v3_dist_uaccess(struct kvm_vcpu *vcpu, bool is_write, int offset, u32 *val); diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 40c3d7c..3027805 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -19,6 +19,7 @@ #include #include #include +#include #include "vfio_pci_priv.h" @@ -315,6 +316,28 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi return 0; } +static int vfio_pci_verify_msi_entry(struct vfio_pci_core_device *vdev, + struct eventfd_ctx *trigger) +{ + struct kvm *kvm = vdev->vdev.kvm; + struct kvm_kernel_irqfd *tmp; + struct kvm_kernel_irq_routing_entry irq_entry; + int ret = -ENODEV; + + spin_lock_irq(>irqfds.lock); + list_for_each_entry(tmp, >irqfds.items, list) { +
RE: [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN
> From: Klaus Jensen > Sent: Wednesday, November 9, 2022 12:40 AM > > On Oct 21 16:10, clay.may...@kioxia.com wrote: > > From: Clay Mayers > > > > ZNS controllers have the option to limit the time a zone can remain in > > the active state. It begins with a background process in the controller > > setting the finish-zone-recommended FZR attribute for a zone. As part of > > setting this attribute, the zone's id is added to the namespace's > > zone-descriptor-changed (ZDC) log page. If enabled, items added to the > > ZDC log page generate a ZDC "asynchronous event notification" AEN. > Optionally, > > the control can induce a "zone excursion" forcing the zone into the finished > > state that also generates a ZDC event. > > > > Zone enabled applications need to properly handle ZDC events. In a real > device, > > the timeout is many hours making testing an application difficult. > > Implemented is the generation of FZR ZDC events to speed up O/S and > application > > testing. > > > > Added to the zoned NVMe command set is an optional, per-namespace timer > > (zoned.finish_time) to set the FZR attr for long-lived active zones; A per > > namespace ZDC log page; AEN results to including CQE.DW1 (the NSID of the > ZDC > > AEN) and generating a ZDC AEN if it's been enabled. Zone excursions are not > > modeled. > > > > See section 5.5 of the NVMe Zoned Namespace Command Set Specification > v1.1 > > for more details. > > > > Changes since v1 > > - Fixed offset length checking in zdc log page > > - Moved zdc_event_queued to the patch 4 > > - Unwatched zdc events in nvme_exit() > > > > Clay Mayers (4): > > hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality > > hw/block/nvme: add zone descriptor changed log page > > hw/block/nvme: supply dw1 for aen result > > hw/block/nvme: add zone descriptor changed AEN > > > > docs/system/devices/nvme.rst | 5 + > > hw/nvme/ctrl.c | 174 +-- > > hw/nvme/ns.c | 15 +++ > > hw/nvme/nvme.h | 37 +++- > > hw/nvme/trace-events | 3 +- > > include/block/nvme.h | 14 ++- > > 6 files changed, 233 insertions(+), 15 deletions(-) > > > > -- > > 2.27.0 > > > > Nice work Clay! Thanks! I think you had implemented excursions at one point and I'm not sure why it didn't make it in. I originally left out excursions because all I needed was the AEN to test my Linux patch. I have a V3 in the works with excursions so I can test what happens to libzbd applications. I can't tell if there is real interest in this but I've tested rocksDB using zenfs and it has difficulties with zone excursions. What I don't know is if excursions are a corner case or not - in my world, they are not. > > Series looks good to me, > > Reviewed-by: Klaus Jensen
Re: [PATCH] target/riscv: Dump sstatus CSR in riscv_cpu_dump_state()
On Wed, Nov 23, 2022 at 2:07 AM Bin Meng wrote: > > sstatus register dump is currently missing in riscv_cpu_dump_state(). > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1332 > Signed-off-by: Bin Meng > > --- > > target/riscv/cpu.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > index d14e95c9dc..80d76f0181 100644 > --- a/target/riscv/cpu.c > +++ b/target/riscv/cpu.c > @@ -382,6 +382,7 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, > int flags) > CSR_MHARTID, > CSR_MSTATUS, > CSR_MSTATUSH, > +CSR_SSTATUS, I don't think we need this. mstatus contains all of the information already and there is limited space to print all of this information out. Alistair > CSR_HSTATUS, > CSR_VSSTATUS, > CSR_MIP, > -- > 2.34.1 > >
[PATCH v2] target/arm: align exposed ID registers with Linux
In CPUID registers exposed to userspace, some registers were missing and some fields were not exposed. This patch aligns exposed ID registers and their fields with what the upstream kernel currently exposes. Specifically, the following new ID registers/fields are exposed to userspace: ID_AA64PFR1_EL1.BT: bits 3-0 ID_AA64PFR1_EL1.MTE: bits 11-8 ID_AA64PFR1_EL1.SME: bits 27-24 ID_AA64ZFR0_EL1.SVEver: bits 3-0 ID_AA64ZFR0_EL1.AES: bits 7-4 ID_AA64ZFR0_EL1.BitPerm: bits 19-16 ID_AA64ZFR0_EL1.BF16: bits 23-20 ID_AA64ZFR0_EL1.SHA3: bits 35-32 ID_AA64ZFR0_EL1.SM4: bits 43-40 ID_AA64ZFR0_EL1.I8MM: bits 47-44 ID_AA64ZFR0_EL1.F32MM:bits 55-52 ID_AA64ZFR0_EL1.F64MM:bits 59-56 ID_AA64SMFR0_EL1.F32F32: bit 32 ID_AA64SMFR0_EL1.B16F32: bit 34 ID_AA64SMFR0_EL1.F16F32: bit 35 ID_AA64SMFR0_EL1.I8I32: bits 39-36 ID_AA64SMFR0_EL1.F64F64: bit 48 ID_AA64SMFR0_EL1.I16I64: bits 55-52 ID_AA64SMFR0_EL1.FA64:bit 63 ID_AA64MMFR0_EL1.ECV: bits 63-60 ID_AA64MMFR1_EL1.AFP: bits 47-44 ID_AA64MMFR2_EL1.AT: bits 35-32 ID_AA64ISAR0_EL1.RNDR:bits 63-60 ID_AA64ISAR1_EL1.FRINTTS: bits 35-32 ID_AA64ISAR1_EL1.BF16:bits 47-44 ID_AA64ISAR1_EL1.DGH: bits 51-48 ID_AA64ISAR1_EL1.I8MM:bits 55-52 ID_AA64ISAR2_EL1.WFxT:bits 3-0 ID_AA64ISAR2_EL1.RPRES: bits 7-4 ID_AA64ISAR2_EL1.GPA3:bits 11-8 ID_AA64ISAR2_EL1.APA3:bits 15-12 The code is also refactored to use symbolic names for ID register fields for better readability and maintainability. Signed-off-by: Zhuojia Shen --- target/arm/helper.c | 96 + 1 file changed, 79 insertions(+), 17 deletions(-) diff --git a/target/arm/helper.c b/target/arm/helper.c index d8c8223ec3..82e45be04a 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -7823,31 +7823,89 @@ void register_cp_regs_for_features(ARMCPU *cpu) #ifdef CONFIG_USER_ONLY static const ARMCPRegUserSpaceInfo v8_user_idregs[] = { { .name = "ID_AA64PFR0_EL1", - .exported_bits = 0x000f000f00ff, - .fixed_bits= 0x0011 }, + .exported_bits = R_ID_AA64PFR0_FP_MASK | + R_ID_AA64PFR0_ADVSIMD_MASK | + R_ID_AA64PFR0_SVE_MASK | + R_ID_AA64PFR0_DIT_MASK, + .fixed_bits = (0x1 << R_ID_AA64PFR0_EL0_SHIFT) | +(0x1 << R_ID_AA64PFR0_EL1_SHIFT) }, { .name = "ID_AA64PFR1_EL1", - .exported_bits = 0x00f0 }, + .exported_bits = R_ID_AA64PFR1_BT_MASK | + R_ID_AA64PFR1_SSBS_MASK | + R_ID_AA64PFR1_MTE_MASK | + R_ID_AA64PFR1_SME_MASK }, { .name = "ID_AA64PFR*_EL1_RESERVED", - .is_glob = true }, -{ .name = "ID_AA64ZFR0_EL1" }, + .is_glob = true }, +{ .name = "ID_AA64ZFR0_EL1", + .exported_bits = R_ID_AA64ZFR0_SVEVER_MASK | + R_ID_AA64ZFR0_AES_MASK | + R_ID_AA64ZFR0_BITPERM_MASK | + R_ID_AA64ZFR0_BFLOAT16_MASK | + R_ID_AA64ZFR0_SHA3_MASK | + R_ID_AA64ZFR0_SM4_MASK | + R_ID_AA64ZFR0_I8MM_MASK | + R_ID_AA64ZFR0_F32MM_MASK | + R_ID_AA64ZFR0_F64MM_MASK }, +{ .name = "ID_AA64SMFR0_EL1", + .exported_bits = R_ID_AA64SMFR0_F32F32_MASK | + R_ID_AA64SMFR0_B16F32_MASK | + R_ID_AA64SMFR0_F16F32_MASK | + R_ID_AA64SMFR0_I8I32_MASK | + R_ID_AA64SMFR0_F64F64_MASK | + R_ID_AA64SMFR0_I16I64_MASK | + R_ID_AA64SMFR0_FA64_MASK }, { .name = "ID_AA64MMFR0_EL1", - .fixed_bits= 0xff00 }, -{ .name = "ID_AA64MMFR1_EL1" }, + .exported_bits = R_ID_AA64MMFR0_ECV_MASK, + .fixed_bits = (0xf << R_ID_AA64MMFR0_TGRAN64_SHIFT) | +(0xf << R_ID_AA64MMFR0_TGRAN4_SHIFT) }, +{ .name = "ID_AA64MMFR1_EL1", + .exported_bits = R_ID_AA64MMFR1_AFP_MASK }, +{ .name = "ID_AA64MMFR2_EL1", + .exported_bits = R_ID_AA64MMFR2_AT_MASK }, { .name = "ID_AA64MMFR*_EL1_RESERVED", - .is_glob = true }, + .is_glob = true }, { .name = "ID_AA64DFR0_EL1", - .fixed_bits= 0x0006 }, -{ .name = "ID_AA64DFR1_EL1" }, + .fixed_bits = (0x6 << R_ID_AA64DFR0_DEBUGVER_SHIFT) }, +{ .name =
Re: [PATCH v2 1/7] qemu/main-loop: Introduce QEMU_IOTHREAD_LOCK_GUARD, WITH_QEMU_IOTHREAD_LOCK
On 22/11/22 21:57, Richard Henderson wrote: Create a couple of wrappers for locking/unlocking the iothread lock. Signed-off-by: Richard Henderson --- include/qemu/main-loop.h | 39 +++ 1 file changed, 39 insertions(+) +#define QEMU_IOTHREAD_LOCK_GUARD() \ +g_auto(IOThreadLockAuto) _iothread_lock_auto\ += qemu_iothread_auto_lock(__FILE__, __LINE__) \ + +#define WITH_QEMU_IOTHREAD_LOCK() \ +for (QEMU_IOTHREAD_LOCK_GUARD();\ + _iothread_lock_auto.iterate; \ + _iothread_lock_auto.iterate = false) Nice, thanks! Reviewed-by: Philippe Mathieu-Daudé
[PATCH v2 2/7] hw/mips: Use WITH_QEMU_IOTHREAD_LOCK in cpu_mips_irq_request
Reviewed-by: Paolo Bonzini Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- hw/mips/mips_int.c | 37 ++--- 1 file changed, 14 insertions(+), 23 deletions(-) diff --git a/hw/mips/mips_int.c b/hw/mips/mips_int.c index 2db5e10fe0..c22598d353 100644 --- a/hw/mips/mips_int.c +++ b/hw/mips/mips_int.c @@ -32,36 +32,27 @@ static void cpu_mips_irq_request(void *opaque, int irq, int level) MIPSCPU *cpu = opaque; CPUMIPSState *env = >env; CPUState *cs = CPU(cpu); -bool locked = false; if (irq < 0 || irq > 7) { return; } -/* Make sure locking works even if BQL is already held by the caller */ -if (!qemu_mutex_iothread_locked()) { -locked = true; -qemu_mutex_lock_iothread(); -} +WITH_QEMU_IOTHREAD_LOCK() { +if (level) { +env->CP0_Cause |= 1 << (irq + CP0Ca_IP); +} else { +env->CP0_Cause &= ~(1 << (irq + CP0Ca_IP)); +} -if (level) { -env->CP0_Cause |= 1 << (irq + CP0Ca_IP); -} else { -env->CP0_Cause &= ~(1 << (irq + CP0Ca_IP)); -} +if (kvm_enabled() && (irq == 2 || irq == 3)) { +kvm_mips_set_interrupt(cpu, irq, level); +} -if (kvm_enabled() && (irq == 2 || irq == 3)) { -kvm_mips_set_interrupt(cpu, irq, level); -} - -if (env->CP0_Cause & CP0Ca_IP_mask) { -cpu_interrupt(cs, CPU_INTERRUPT_HARD); -} else { -cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD); -} - -if (locked) { -qemu_mutex_unlock_iothread(); +if (env->CP0_Cause & CP0Ca_IP_mask) { +cpu_interrupt(cs, CPU_INTERRUPT_HARD); +} else { +cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD); +} } } -- 2.34.1
[PATCH v2 1/7] qemu/main-loop: Introduce QEMU_IOTHREAD_LOCK_GUARD, WITH_QEMU_IOTHREAD_LOCK
Create a couple of wrappers for locking/unlocking the iothread lock. Signed-off-by: Richard Henderson --- include/qemu/main-loop.h | 39 +++ 1 file changed, 39 insertions(+) diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h index 3c9a9a982d..73c60a9af4 100644 --- a/include/qemu/main-loop.h +++ b/include/qemu/main-loop.h @@ -343,6 +343,45 @@ void qemu_mutex_lock_iothread_impl(const char *file, int line); */ void qemu_mutex_unlock_iothread(void); +/** + * QEMU_IOTHREAD_LOCK_GUARD + * WITH_QEMU_IOTHREAD_LOCK + * + * Wrap a block of code in a conditional qemu_mutex_{lock,unlock}_iothread. + */ +typedef struct IOThreadLockAuto { +bool locked; +bool iterate; +} IOThreadLockAuto; + +static inline IOThreadLockAuto qemu_iothread_auto_lock(const char *file, + int line) +{ +bool need = !qemu_mutex_iothread_locked(); +if (need) { +qemu_mutex_lock_iothread_impl(file, line); +} +return (IOThreadLockAuto){ .locked = need, .iterate = true }; +} + +static inline void qemu_iothread_auto_unlock(IOThreadLockAuto *l) +{ +if (l->locked) { +qemu_mutex_unlock_iothread(); +} +} + +G_DEFINE_AUTO_CLEANUP_CLEAR_FUNC(IOThreadLockAuto, qemu_iothread_auto_unlock) + +#define QEMU_IOTHREAD_LOCK_GUARD() \ +g_auto(IOThreadLockAuto) _iothread_lock_auto\ += qemu_iothread_auto_lock(__FILE__, __LINE__) \ + +#define WITH_QEMU_IOTHREAD_LOCK() \ +for (QEMU_IOTHREAD_LOCK_GUARD();\ + _iothread_lock_auto.iterate; \ + _iothread_lock_auto.iterate = false) + /* * qemu_cond_wait_iothread: Wait on condition for the main loop mutex * -- 2.34.1
[PATCH] hw/intc/arm_gicv3: Fix GICD_TYPER ITLinesNumber advertisement
The ARM GICv3 TRM describes that the ITLinesNumber field of GICD_TYPER register: "indicates the maximum SPI INTID that the GIC implementation supports" As SPI #0 is absolute IRQ #32, the max SPI INTID should have accounted for the internal 16x SGI's and 16x PPI's. However, the original GICv3 model subtracted off the SGI/PPI. Cosmetically this can be seen at OS boot (Linux) showing 32 shy of what should be there, i.e.: [0.00] GICv3: 224 SPIs implemented Though in hw/arm/virt.c, the machine is configured for 256 SPI's. ARM virt machine likely doesn't have a problem with this because the upper 32 IRQ's don't actually have anything meaningful wired. But, this does become a functional issue on a custom use case which wants to make use of these IRQ's. Additionally, boot code (i.e. TF-A) will only init up to the number (blocks of 32) that it believes to actually be there. Signed-off-by: Luke Starrett --- hw/intc/arm_gicv3_dist.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/intc/arm_gicv3_dist.c b/hw/intc/arm_gicv3_dist.c index eea0368118..d599fefcbc 100644 --- a/hw/intc/arm_gicv3_dist.c +++ b/hw/intc/arm_gicv3_dist.c @@ -390,9 +390,9 @@ static bool gicd_readl(GICv3State *s, hwaddr offset, * MBIS == 0 (message-based SPIs not supported) * SecurityExtn == 1 if security extns supported * CPUNumber == 0 since for us ARE is always 1 - * ITLinesNumber == (num external irqs / 32) - 1 + * ITLinesNumber == (((max SPI IntID + 1) / 32) - 1) */ -int itlinesnumber = ((s->num_irq - GIC_INTERNAL) / 32) - 1; +int itlinesnumber = (s->num_irq / 32) - 1; /* * SecurityExtn must be RAZ if GICD_CTLR.DS == 1, and * "security extensions not supported" always implies DS == 1, -- 2.27.0
Subject: [PATCH] hw/intc/arm_gicv3: Fix GICD_TYPER ITLinesNumber advertisement
The ARM GICv3 TRM describes that the ITLinesNumber field of GICD_TYPER register: "indicates the maximum SPI INTID that the GIC implementation supports" As SPI #0 is absolute IRQ #32, the max SPI INTID should have accounted for the internal 16x SGI's and 16x PPI's. However, the original GICv3 model subtracted off the SGI/PPI. Cosmetically this can be seen at OS boot (Linux) showing 32 shy of what should be there, i.e.: [ 0.00] GICv3: 224 SPIs implemented Though in hw/arm/virt.c, the machine is configured for 256 SPI's. ARM virt machine likely doesn't have a problem with this because the upper 32 IRQ's don't actually have anything meaningful wired. But, this does become a functional issue on a custom use case which wants to make use of these IRQ's. Additionally, boot code (i.e. TF-A) will only init up to the number (blocks of 32) that it believes to actually be there. Signed-off-by: Luke Starrett --- hw/intc/arm_gicv3_dist.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/intc/arm_gicv3_dist.c b/hw/intc/arm_gicv3_dist.c index eea0368118..d599fefcbc 100644 --- a/hw/intc/arm_gicv3_dist.c +++ b/hw/intc/arm_gicv3_dist.c @@ -390,9 +390,9 @@ static bool gicd_readl(GICv3State *s, hwaddr offset, * MBIS == 0 (message-based SPIs not supported) * SecurityExtn == 1 if security extns supported * CPUNumber == 0 since for us ARE is always 1 - * ITLinesNumber == (num external irqs / 32) - 1 + * ITLinesNumber == (((max SPI IntID + 1) / 32) - 1) */ - int itlinesnumber = ((s->num_irq - GIC_INTERNAL) / 32) - 1; + int itlinesnumber = (s->num_irq / 32) - 1; /* * SecurityExtn must be RAZ if GICD_CTLR.DS == 1, and * "security extensions not supported" always implies DS == 1, -- 2.27.0
[PATCH v2 4/7] target/ppc: Use QEMU_IOTHREAD_LOCK_GUARD in cpu_interrupt_exittb
In addition, use tcg_enabled instead of !kvm_enabled. Reviewed-by: Paolo Bonzini Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Daniel Henrique Barboza Signed-off-by: Richard Henderson --- target/ppc/helper_regs.c | 14 -- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/target/ppc/helper_regs.c b/target/ppc/helper_regs.c index c0aee5855b..779e7db513 100644 --- a/target/ppc/helper_regs.c +++ b/target/ppc/helper_regs.c @@ -22,6 +22,7 @@ #include "qemu/main-loop.h" #include "exec/exec-all.h" #include "sysemu/kvm.h" +#include "sysemu/tcg.h" #include "helper_regs.h" #include "power8-pmu.h" #include "cpu-models.h" @@ -203,17 +204,10 @@ void cpu_interrupt_exittb(CPUState *cs) { /* * We don't need to worry about translation blocks - * when running with KVM. + * unless running with TCG. */ -if (kvm_enabled()) { -return; -} - -if (!qemu_mutex_iothread_locked()) { -qemu_mutex_lock_iothread(); -cpu_interrupt(cs, CPU_INTERRUPT_EXITTB); -qemu_mutex_unlock_iothread(); -} else { +if (tcg_enabled()) { +QEMU_IOTHREAD_LOCK_GUARD(); cpu_interrupt(cs, CPU_INTERRUPT_EXITTB); } } -- 2.34.1
[PATCH v2 0/7] main-loop: Introduce QEMU_IOTHREAD_LOCK_GUARD
Simplify the usage of qemu_mutex_lock_iothread. Split out for ease of review. Changes for v2: * Add WITH_QEMU_IOTHREAD_LOCK and use it a couple of places. This re-implements patch 1, so r-b's dropped. r~ Richard Henderson (7): qemu/main-loop: Introduce QEMU_IOTHREAD_LOCK_GUARD, WITH_QEMU_IOTHREAD_LOCK hw/mips: Use WITH_QEMU_IOTHREAD_LOCK in cpu_mips_irq_request target/ppc: Use QEMU_IOTHREAD_LOCK_GUARD in ppc_maybe_interrupt target/ppc: Use QEMU_IOTHREAD_LOCK_GUARD in cpu_interrupt_exittb hw/ppc: Use QEMU_IOTHREAD_LOCK_GUARD in ppc_set_irq target/riscv: Use QEMU_IOTHREAD_LOCK_GUARD in riscv_cpu_update_mip accel/tcg: Use WITH_QEMU_IOTHREAD_LOCK in io_readx/io_writex include/qemu/main-loop.h | 39 +++ accel/tcg/cputlb.c| 23 ++- hw/mips/mips_int.c| 37 ++--- hw/ppc/ppc.c | 10 +- target/ppc/excp_helper.c | 11 +-- target/ppc/helper_regs.c | 14 -- target/riscv/cpu_helper.c | 22 +++--- 7 files changed, 72 insertions(+), 84 deletions(-) -- 2.34.1
Re: Subject: [PATCH] hw/intc/arm_gicv3: Fix GICD_TYPER ITLinesNumber advertisement
Please disregard this one. Sent in error. On 11/22/2022 2:18 PM, Luke Starrett wrote: The ARM GICv3 TRM describes that the ITLinesNumber field of GICD_TYPER register: "indicates the maximum SPI INTID that the GIC implementation supports" As SPI #0 is absolute IRQ #32, the max SPI INTID should have accounted for the internal 16x SGI's and 16x PPI's. However, the original GICv3 model subtracted off the SGI/PPI. Cosmetically this can be seen at OS boot (Linux) showing 32 shy of what should be there, i.e.: [ 0.00] GICv3: 224 SPIs implemented Though in hw/arm/virt.c, the machine is configured for 256 SPI's. ARM virt machine likely doesn't have a problem with this because the upper 32 IRQ's don't actually have anything meaningful wired. But, this does become a functional issue on a custom use case which wants to make use of these IRQ's. Additionally, boot code (i.e. TF-A) will only init up to the number (blocks of 32) that it believes to actually be there. Signed-off-by: Luke Starrett --- hw/intc/arm_gicv3_dist.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/intc/arm_gicv3_dist.c b/hw/intc/arm_gicv3_dist.c index eea0368118..d599fefcbc 100644 --- a/hw/intc/arm_gicv3_dist.c +++ b/hw/intc/arm_gicv3_dist.c @@ -390,9 +390,9 @@ static bool gicd_readl(GICv3State *s, hwaddr offset, * MBIS == 0 (message-based SPIs not supported) * SecurityExtn == 1 if security extns supported * CPUNumber == 0 since for us ARE is always 1 - * ITLinesNumber == (num external irqs / 32) - 1 + * ITLinesNumber == (((max SPI IntID + 1) / 32) - 1) */ - int itlinesnumber = ((s->num_irq - GIC_INTERNAL) / 32) - 1; + int itlinesnumber = (s->num_irq / 32) - 1; /* * SecurityExtn must be RAZ if GICD_CTLR.DS == 1, and * "security extensions not supported" always implies DS == 1,
[PATCH v2 7/7] accel/tcg: Use WITH_QEMU_IOTHREAD_LOCK in io_readx/io_writex
Narrow the scope of the lock to the actual read/write, moving the cpu_transation_failed call outside the lock. Reviewed-by: Paolo Bonzini Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- accel/tcg/cputlb.c | 23 ++- 1 file changed, 6 insertions(+), 17 deletions(-) diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 6f1c00682b..8b86b70c60 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -1356,7 +1356,6 @@ static uint64_t io_readx(CPUArchState *env, CPUTLBEntryFull *full, MemoryRegionSection *section; MemoryRegion *mr; uint64_t val; -bool locked = false; MemTxResult r; section = iotlb_to_section(cpu, full->xlat_section, full->attrs); @@ -1367,11 +1366,10 @@ static uint64_t io_readx(CPUArchState *env, CPUTLBEntryFull *full, cpu_io_recompile(cpu, retaddr); } -if (!qemu_mutex_iothread_locked()) { -qemu_mutex_lock_iothread(); -locked = true; +WITH_QEMU_IOTHREAD_LOCK() { +r = memory_region_dispatch_read(mr, mr_offset, , op, full->attrs); } -r = memory_region_dispatch_read(mr, mr_offset, , op, full->attrs); + if (r != MEMTX_OK) { hwaddr physaddr = mr_offset + section->offset_within_address_space - @@ -1380,10 +1378,6 @@ static uint64_t io_readx(CPUArchState *env, CPUTLBEntryFull *full, cpu_transaction_failed(cpu, physaddr, addr, memop_size(op), access_type, mmu_idx, full->attrs, r, retaddr); } -if (locked) { -qemu_mutex_unlock_iothread(); -} - return val; } @@ -1410,7 +1404,6 @@ static void io_writex(CPUArchState *env, CPUTLBEntryFull *full, hwaddr mr_offset; MemoryRegionSection *section; MemoryRegion *mr; -bool locked = false; MemTxResult r; section = iotlb_to_section(cpu, full->xlat_section, full->attrs); @@ -1427,11 +1420,10 @@ static void io_writex(CPUArchState *env, CPUTLBEntryFull *full, */ save_iotlb_data(cpu, section, mr_offset); -if (!qemu_mutex_iothread_locked()) { -qemu_mutex_lock_iothread(); -locked = true; +WITH_QEMU_IOTHREAD_LOCK() { +r = memory_region_dispatch_write(mr, mr_offset, val, op, full->attrs); } -r = memory_region_dispatch_write(mr, mr_offset, val, op, full->attrs); + if (r != MEMTX_OK) { hwaddr physaddr = mr_offset + section->offset_within_address_space - @@ -1441,9 +1433,6 @@ static void io_writex(CPUArchState *env, CPUTLBEntryFull *full, MMU_DATA_STORE, mmu_idx, full->attrs, r, retaddr); } -if (locked) { -qemu_mutex_unlock_iothread(); -} } static inline target_ulong tlb_read_ofs(CPUTLBEntry *entry, size_t ofs) -- 2.34.1
[PATCH v2 6/7] target/riscv: Use QEMU_IOTHREAD_LOCK_GUARD in riscv_cpu_update_mip
Reviewed-by: Paolo Bonzini Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Alistair Francis Signed-off-by: Richard Henderson --- target/riscv/cpu_helper.c | 22 +++--- 1 file changed, 7 insertions(+), 15 deletions(-) diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c index 278d163803..4c4b404dad 100644 --- a/target/riscv/cpu_helper.c +++ b/target/riscv/cpu_helper.c @@ -610,7 +610,6 @@ uint64_t riscv_cpu_update_mip(RISCVCPU *cpu, uint64_t mask, uint64_t value) CPURISCVState *env = >env; CPUState *cs = CPU(cpu); uint64_t gein, vsgein = 0, vstip = 0, old = env->mip; -bool locked = false; if (riscv_cpu_virt_enabled(env)) { gein = get_field(env->hstatus, HSTATUS_VGEIN); @@ -621,21 +620,14 @@ uint64_t riscv_cpu_update_mip(RISCVCPU *cpu, uint64_t mask, uint64_t value) mask = ((mask == MIP_VSTIP) && env->vstime_irq) ? 0 : mask; vstip = env->vstime_irq ? MIP_VSTIP : 0; -if (!qemu_mutex_iothread_locked()) { -locked = true; -qemu_mutex_lock_iothread(); -} +WITH_QEMU_IOTHREAD_LOCK() { +env->mip = (env->mip & ~mask) | (value & mask); -env->mip = (env->mip & ~mask) | (value & mask); - -if (env->mip | vsgein | vstip) { -cpu_interrupt(cs, CPU_INTERRUPT_HARD); -} else { -cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD); -} - -if (locked) { -qemu_mutex_unlock_iothread(); +if (env->mip | vsgein | vstip) { +cpu_interrupt(cs, CPU_INTERRUPT_HARD); +} else { +cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD); +} } return old; -- 2.34.1
[PATCH v2 5/7] hw/ppc: Use QEMU_IOTHREAD_LOCK_GUARD in ppc_set_irq
Reviewed-by: Paolo Bonzini Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Daniel Henrique Barboza Signed-off-by: Richard Henderson --- hw/ppc/ppc.c | 10 +- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c index dc86c1c7db..4e816c68c7 100644 --- a/hw/ppc/ppc.c +++ b/hw/ppc/ppc.c @@ -44,13 +44,9 @@ void ppc_set_irq(PowerPCCPU *cpu, int irq, int level) { CPUPPCState *env = >env; unsigned int old_pending; -bool locked = false; /* We may already have the BQL if coming from the reset path */ -if (!qemu_mutex_iothread_locked()) { -locked = true; -qemu_mutex_lock_iothread(); -} +QEMU_IOTHREAD_LOCK_GUARD(); old_pending = env->pending_interrupts; @@ -67,10 +63,6 @@ void ppc_set_irq(PowerPCCPU *cpu, int irq, int level) trace_ppc_irq_set_exit(env, irq, level, env->pending_interrupts, CPU(cpu)->interrupt_request); - -if (locked) { -qemu_mutex_unlock_iothread(); -} } /* PowerPC 6xx / 7xx internal IRQ controller */ -- 2.34.1
[PATCH v2 3/7] target/ppc: Use QEMU_IOTHREAD_LOCK_GUARD in ppc_maybe_interrupt
Reviewed-by: Paolo Bonzini Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Daniel Henrique Barboza Signed-off-by: Richard Henderson --- target/ppc/excp_helper.c | 11 +-- 1 file changed, 1 insertion(+), 10 deletions(-) diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c index 94adcb766b..8591bb3f73 100644 --- a/target/ppc/excp_helper.c +++ b/target/ppc/excp_helper.c @@ -2163,22 +2163,13 @@ static int ppc_next_unmasked_interrupt(CPUPPCState *env) void ppc_maybe_interrupt(CPUPPCState *env) { CPUState *cs = env_cpu(env); -bool locked = false; - -if (!qemu_mutex_iothread_locked()) { -locked = true; -qemu_mutex_lock_iothread(); -} +QEMU_IOTHREAD_LOCK_GUARD(); if (ppc_next_unmasked_interrupt(env)) { cpu_interrupt(cs, CPU_INTERRUPT_HARD); } else { cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD); } - -if (locked) { -qemu_mutex_unlock_iothread(); -} } #if defined(TARGET_PPC64) -- 2.34.1
Re: [PATCH] multifd: Updated QAPI format for 'migrate' qemu monitor command
On 22/11/22 2:53 pm, Daniel P. Berrangé wrote: On Mon, Nov 21, 2022 at 01:40:27PM +0100, Juan Quintela wrote: Het Gala wrote: To prevent double data encoding of uris, instead of passing transport mechanisms, host address and port all together in form of a single string and writing different parsing functions, we intend the user to explicitly mention most of the parameters through the medium of qmp command itself. +{ 'struct': 'MigrateChannel', + 'data' : { +'channeltype' : 'MigrateChannelType', +'*src-addr' : 'MigrateAddress', +'dest-addr' : 'MigrateAddress', Why do we want *both* addresses? This is part of their goal to have source based routing, so that traffic exits the src host via a particular NIC. I think this patch would be better if we split it into two parts. First introduce "MigrateChannel" struct with *only* the 'dest-addr' field, and only allow one element in the list, making 'uri' optional. Basically the first patch would *only* be about switching the 'migrate' command from using a plain string to a MigrateAddress structure. A second patch would then extend it to allow multiple MigrateChannel elements, to support different destinations for each channel. A third patch would then extend it to add src-addr to attain the source based routing. A fourth patch can then deprecate the existing 'uri' field. > Thanks Daniel. This is a nice idea. I will break this patch into 4 different patches, so it would be clear to see how the QAPI design is evolving. +'*multifd-count' : 'int' } } And if we are passing a list, why do we want to pass the real number? Yeah, technically I think this field is redundant, as you can just add many entires to the 'channels' list, using the same addresses. All this field does is allow a more compact JSON arg list. I'm not sure this is neccessary, unless we're expecting huge numbers of 'channels', and even then this isn't likely to be a performance issue. > I have tried to explain this to Juan. The main purpose is, if we want to add 3 channels to one pair of interface and 4 pair of channels to another pair of interface, instead of writing the same interface 3 or 4 times, this field saves that redundancy, and I personally feel, it gives one clear representation of multifd capability. +# -> { "execute": "migrate", +# "arguments": { +# "channels": [ { 'channeltype': 'control', +# 'dest-addr': {'transport': 'socket', +#'type': 'inet', +#'host': '10.12.34.9', 'port': '1050'}}, +#{ 'channeltype': 'data', +# 'src-addr': {'transport': 'socket', +#'type': 'inet', +#'host': '10.12.34.9', +#'port': '4000', 'ipv4': 'true'}, +# 'dest-addr': { 'transport': 'socket', +# 'type': 'inet', +# 'host': '10.12.34.92', +# 'port': '1234', 'ipv4': 'true'}, +# 'multifd-count': 5 }, +#{ 'channeltype': 'data', +# 'src-addr': {'transport': 'socket', +#'type': 'inet', +#'host': '10.2.3.4', 'port': '1000'}, +# 'dest-addr': {'transport': 'socket', +# 'type': 'inet', +# 'host': '0.0.0.4', 'port': '3000'}, +# 'multifd-count': 3 } ] } } +# <- { "return": {} } +# ## { 'command': 'migrate', - 'data': {'uri': 'str', '*blk': 'bool', '*inc': 'bool', - '*detach': 'bool', '*resume': 'bool' } } + 'data': {'*uri': 'str', '*channels': ['MigrateChannel'], '*blk': 'bool', - I would not care at all about the "exec" protocol, just leave that alone in the deprecated command. Right now: * we can't move it to multifd without a lot of PAIN * there are patches on the list suggesting that what we really want is to create a file that is the size of RAM and just write all the RAM at the right place. * that would make the way to create snapshots (I don't know if anyones still wants them, much easier). * I think that the only real use of exec migration was to create snapshots, for real migration, using a socket is much, much saner. * I.e. what I mean here is that for exec migration, we need to think if we want to continue supporting it for normal migration, or only as a way to create snapshots. With regards, Daniel Thanks and regards, Het Gala
Re: [PULL 0/2] target-arm queue
Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any user-visible changes. signature.asc Description: PGP signature
Re: [PATCH] vhost-vdpa: skip TPM CRB memory section
On Tue, Nov 22, 2022 at 06:53:49PM +0400, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau > > 851d6d1a0f ("vfio/common: remove spurious tpm-crb-cmd misalignment > warning") removed the warning on vfio_listener_region_add() path. > > An error is reported for vhost-vdpa case: > qemu-kvm: vhost_vdpa_listener_region_add received unaligned region > > Skip the CRB device. > > Fixes: > https://bugzilla.redhat.com/show_bug.cgi?id=2141965 > > Signed-off-by: Marc-André Lureau > --- > hw/virtio/vhost-vdpa.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > index 7468e44b87..9d7206e4b8 100644 > --- a/hw/virtio/vhost-vdpa.c > +++ b/hw/virtio/vhost-vdpa.c > @@ -19,6 +19,7 @@ > #include "hw/virtio/virtio-net.h" > #include "hw/virtio/vhost-shadow-virtqueue.h" > #include "hw/virtio/vhost-vdpa.h" > +#include "sysemu/tpm.h" > #include "exec/address-spaces.h" > #include "migration/blocker.h" > #include "qemu/cutils.h" > @@ -46,6 +47,11 @@ static bool > vhost_vdpa_listener_skipped_section(MemoryRegionSection *section, > { > Int128 llend; > > +if (TPM_IS_CRB(section->mr->owner)) { > +/* The CRB command buffer has its base address unaligned. */ > +return true; > +} > + Quite a hack. We can't really keep adding dependency on random devices to vhost. And would you add hacks like this to listeners? Pls figure out what's special about this buffer. Also if this section is unaligned then doesn't it break up other aligned sections? > if ((!memory_region_is_ram(section->mr) && > !memory_region_is_iommu(section->mr)) || > memory_region_is_protected(section->mr) || > -- > 2.38.1
Re: [RFC 4/7] migration: Split save_live_pending() into state_pending_*
* Juan Quintela (quint...@redhat.com) wrote: > We split the function into to: > > - state_pending_estimate: We estimate the remaining state size without > stopping the machine. > > - state pending_exact: We calculate the exact amount of remaining > state. > > The only "device" that implements different functions for _estimate() > and _exact() is ram. > > Signed-off-by: Juan Quintela Yeh I think that's OK; I'm a little worried whether you end up calling the two functions in migration_iteration_run a lot, but still Reviewed-by: Dr. David Alan Gilbert > --- > docs/devel/migration.rst | 18 ++ > docs/devel/vfio-migration.rst | 4 ++-- > include/migration/register.h | 12 > migration/savevm.h | 8 ++-- > hw/s390x/s390-stattrib.c | 7 --- > hw/vfio/migration.c| 9 + > migration/block-dirty-bitmap.c | 11 ++- > migration/block.c | 11 ++- > migration/migration.c | 13 + > migration/ram.c| 31 --- > migration/savevm.c | 34 +- > hw/vfio/trace-events | 2 +- > migration/trace-events | 7 --- > 13 files changed, 114 insertions(+), 53 deletions(-) > > diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst > index 3e9656d8e0..6f65c23b47 100644 > --- a/docs/devel/migration.rst > +++ b/docs/devel/migration.rst > @@ -482,15 +482,17 @@ An iterative device must provide: >- A ``load_setup`` function that initialises the data structures on the > destination. > > - - A ``save_live_pending`` function that is called repeatedly and must > -indicate how much more data the iterative data must save. The core > -migration code will use this to determine when to pause the CPUs > -and complete the migration. > + - A ``state_pending_exact`` function that indicates how much more > +data we must save. The core migration code will use this to > +determine when to pause the CPUs and complete the migration. > > - - A ``save_live_iterate`` function (called after ``save_live_pending`` > -when there is significant data still to be sent). It should send > -a chunk of data until the point that stream bandwidth limits tell it > -to stop. Each call generates one section. > + - A ``state_pending_estimate`` function that indicates how much more > +data we must save. When the estimated amount is smaller than the > +threshold, we call ``state_pending_exact``. > + > + - A ``save_live_iterate`` function should send a chunk of data until > +the point that stream bandwidth limits tell it to stop. Each call > +generates one section. > >- A ``save_live_complete_precopy`` function that must transmit the > last section for the device containing any remaining data. > diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst > index 9ff6163c88..673057c90d 100644 > --- a/docs/devel/vfio-migration.rst > +++ b/docs/devel/vfio-migration.rst > @@ -28,7 +28,7 @@ VFIO implements the device hooks for the iterative approach > as follows: > * A ``load_setup`` function that sets up the migration region on the >destination and sets _RESUMING flag in the VFIO device state. > > -* A ``save_live_pending`` function that reads pending_bytes from the vendor > +* A ``state_pending_exact`` function that reads pending_bytes from the vendor >driver, which indicates the amount of data that the vendor driver has yet > to >save for the VFIO device. > > @@ -114,7 +114,7 @@ Live migration save path > (RUNNING, _SETUP, _RUNNING|_SAVING) >| > (RUNNING, _ACTIVE, _RUNNING|_SAVING) > - If device is active, get pending_bytes by .save_live_pending() > + If device is active, get pending_bytes by .state_pending_exact() >If total pending_bytes >= threshold_size, call .save_live_iterate() >Data of VFIO device for pre-copy phase is copied > Iterate till total pending bytes converge and are less than threshold > diff --git a/include/migration/register.h b/include/migration/register.h > index 5b5424ed8f..313b8e1c3b 100644 > --- a/include/migration/register.h > +++ b/include/migration/register.h > @@ -46,9 +46,7 @@ typedef struct SaveVMHandlers { > > /* This runs outside the iothread lock! */ > int (*save_setup)(QEMUFile *f, void *opaque); > -void (*save_live_pending)(void *opaque, uint64_t threshold_size, > - uint64_t *rest_precopy, uint64_t > *rest_postcopy); > -/* Note for save_live_pending: > +/* Note for state_pending_*: > * - res_precopy is for data which must be migrated in precopy > * phase or in stopped state, in other words - before target > * vm start > @@ -59,7 +57,13 @@ typedef
Re: [PATCH] multifd: Updated QAPI format for 'migrate' qemu monitor command
On 21/11/22 6:10 pm, Juan Quintela wrote: Het Gala wrote: To prevent double data encoding of uris, instead of passing transport mechanisms, host address and port all together in form of a single string and writing different parsing functions, we intend the user to explicitly mention most of the parameters through the medium of qmp command itself. Hi 1st of all, I can't see how this is 7.1 material, I guess we need to move it to 8.0. > Thankyou for informing. I will change it to 8.0 +## +# @MigrateTransport: +# +# The supported communication transport mechanisms for migration +# +# @socket: Supported communication type between two devices for migration. +# Socket is able to cover all of 'tcp', 'unix', 'vsock' and +# 'fd' already +# +# @exec: Supported communication type to redirect migration stream into file. +# +# Since 7.1 +## +{ 'enum': 'MigrateTransport', + 'data': ['socket', 'exec'] } I haven't looked too much into this, but as Danield told in the past, I can see where the rdma falls into this scheme. I guess it is going to be its own, but who knows. >So do you mean, 'data' : ['socket', 'exec', 'rdma'] ? or it will be separately represented +# The supported options for migration channel type requests +# +# @control: Support request for main outbound migration control channel +# +# @data: Supported Channel type for multifd data channels +# +# @async: Supported Channel type for post-copy async requests +# +# Since 7.1 +## +{ 'enum': 'MigrateChannelType', + 'data': ['control', 'data', 'async'] } + 'data': ['main', 'data', 'ram-async'] } ??? I don't like the 'control' name because without multifd we still pass everything through it. And with multifd, we still pass all devices through it. > I agree with you Juan. 'main' seams a better name here. Thanks for the suggestion :) About the asynchronous channel, I don't know if calling it postcopy is better. > Okay, noted. Will change in the next patchset +{ 'struct': 'MigrateChannel', + 'data' : { +'channeltype' : 'MigrateChannelType', +'*src-addr' : 'MigrateAddress', +'dest-addr' : 'MigrateAddress', Why do we want *both* addresses? > As mentioned by Daniel, multifd right now is supported for tcp socket, and we want to make multifd migration as bi-directional migration stream. +'*multifd-count' : 'int' } } And if we are passing a list, why do we want to pass the real number? > I think multifd-count variable would be handy. Just to take a used case here, we have heavy workloads to migrate, and in our system we have a 25 Gig X 2 NIC cards available, and we want to ensure that migration does not fail due to less available network (because IO, and other processes might also consume network). By experiments, we know that per multifd channel - on an average we get around 8.5-10 Gbps. Writing the same channel atleast 3 times to cover one NIC seems again redundant. So to prevent this, we think inclusion of multifd-count would be useful there. And anyways it is an optional, so if you don't mention it, it will take into account as a single thread by default. Let me know if this is convincing or we can discuss more on this ? +# -> { "execute": "migrate", +# "arguments": { +# "channels": [ { 'channeltype': 'control', +# 'dest-addr': {'transport': 'socket', +#'type': 'inet', +#'host': '10.12.34.9', 'port': '1050'}}, +#{ 'channeltype': 'data', +# 'src-addr': {'transport': 'socket', +#'type': 'inet', +#'host': '10.12.34.9', +#'port': '4000', 'ipv4': 'true'}, +# 'dest-addr': { 'transport': 'socket', +# 'type': 'inet', +# 'host': '10.12.34.92', +# 'port': '1234', 'ipv4': 'true'}, +# 'multifd-count': 5 }, +#{ 'channeltype': 'data', +# 'src-addr': {'transport': 'socket', +#'type': 'inet', +#'host': '10.2.3.4', 'port': '1000'}, +# 'dest-addr': {'transport': 'socket', +# 'type': 'inet', +# 'host': '0.0.0.4', 'port': '3000'}, +# 'multifd-count': 3 } ] } } +# <- { "return": {} } +# ## { 'command': 'migrate', - 'data': {'uri': 'str', '*blk': 'bool', '*inc': 'bool', - '*detach': 'bool', '*resume': 'bool' } } + 'data': {'*uri': 'str', '*channels': ['MigrateChannel'], '*blk': 'bool', I think that "uri" bit should be dropped, right? > yes, it will be dropped at a later point, right
Re: [PATCH] target/arm: align exposed ID registers with Linux
On 11/22/2022 10:12 AM -0800, Richard Henderson wrote: > On 11/21/22 18:48, Zhuojia Shen wrote: > > In CPUID registers exposed to userspace, some registers were missing > > and some fields were not exposed. This patch aligns exposed ID > > registers and their fields with what Linux exposes currently. > > > > Specifically, the following new ID registers/fields are exposed to > > userspace: > > > > ID_AA64PFR1_EL1.BT: bits 3-0 > > ID_AA64PFR1_EL1.MTE: bits 11-8 > > ID_AA64PFR1_EL1.SME: bits 27-24 > > > > ID_AA64ZFR0_EL1.SVEver: bits 3-0 > > ID_AA64ZFR0_EL1.AES: bits 7-4 > > ID_AA64ZFR0_EL1.BitPerm: bits 19-16 > > ID_AA64ZFR0_EL1.BF16: bits 23-20 > > ID_AA64ZFR0_EL1.SHA3: bits 35-32 > > ID_AA64ZFR0_EL1.SM4: bits 43-40 > > ID_AA64ZFR0_EL1.I8MM: bits 47-44 > > ID_AA64ZFR0_EL1.F32MM:bits 55-52 > > ID_AA64ZFR0_EL1.F64MM:bits 59-56 > > > > ID_AA64SMFR0_EL1.F32F32: bit 32 > > ID_AA64SMFR0_EL1.B16F32: bit 34 > > ID_AA64SMFR0_EL1.F16F32: bit 35 > > ID_AA64SMFR0_EL1.I8I32: bits 39-36 > > ID_AA64SMFR0_EL1.F64F64: bit 48 > > ID_AA64SMFR0_EL1.I16I64: bits 55-52 > > ID_AA64SMFR0_EL1.FA64:bit 63 > > > > ID_AA64MMFR0_EL1.ECV: bits 63-60 > > > > ID_AA64MMFR1_EL1.AFP: bits 47-44 > > > > ID_AA64MMFR2_EL1.AT: bits 35-32 > > > > ID_AA64ISAR0_EL1.RNDR:bits 63-60 > > > > ID_AA64ISAR1_EL1.FRINTTS: bits 35-32 > > ID_AA64ISAR1_EL1.BF16:bits 47-44 > > ID_AA64ISAR1_EL1.DGH: bits 51-48 > > ID_AA64ISAR1_EL1.I8MM:bits 55-52 > > > > ID_AA64ISAR2_EL1.WFxT:bits 3-0 > > ID_AA64ISAR2_EL1.RPRES: bits 7-4 > > ID_AA64ISAR2_EL1.GPA3:bits 11-8 > > ID_AA64ISAR2_EL1.APA3:bits 15-12 > > > > Signed-off-by: Zhuojia Shen > > --- > > target/arm/helper.c | 19 ++- > > 1 file changed, 14 insertions(+), 5 deletions(-) > > > > diff --git a/target/arm/helper.c b/target/arm/helper.c > > index d8c8223ec3..ce6fd7a96d 100644 > > --- a/target/arm/helper.c > > +++ b/target/arm/helper.c > > @@ -7826,13 +7826,20 @@ void register_cp_regs_for_features(ARMCPU *cpu) > > .exported_bits = 0x000f000f00ff, > > .fixed_bits= 0x0011 }, > > { .name = "ID_AA64PFR1_EL1", > > - .exported_bits = 0x00f0 }, > > + .exported_bits = 0x0f000fff }, > > Existing, but I think it would be nicer to do this symbolically. e.g. > >.exported_bits = R_ID_AA64PFR1_BT_MASK | > R_ID_AA64PFR1_SBSS_MASK | > R_ID_AA64PFR1_MTE_MASK | > R_ID_AA64PFR1_SME_MASK, > > etc. It would be more readable but longer. I can try to refactor this way. > > > r~
Re: [PATCH] target/arm: align exposed ID registers with Linux
On 11/22/2022 06:26 PM +, Peter Maydell wrote: > On Tue, 22 Nov 2022 at 03:05, Zhuojia Shen > wrote: > > > > In CPUID registers exposed to userspace, some registers were missing > > and some fields were not exposed. This patch aligns exposed ID > > registers and their fields with what Linux exposes currently. > > > > Specifically, the following new ID registers/fields are exposed to > > userspace: > > These changes don't quite seem to line up with what the kernel > documents that it exposes: > https://www.kernel.org/doc/Documentation/arm64/cpu-feature-registers.rst This kernel document seems not quite up-to-date. The patch is based on the upstream code: https://github.com/torvalds/linux/blob/master/arch/arm64/kernel/cpufeature.c > > > ID_AA64PFR1_EL1.BT: bits 3-0 > > ID_AA64PFR1_EL1.MTE: bits 11-8 > > ID_AA64PFR1_EL1.SME: bits 27-24 > > .SME not listed as exposed. > > > ID_AA64SMFR0_EL1.F32F32: bit 32 > > ID_AA64SMFR0_EL1.B16F32: bit 34 > > ID_AA64SMFR0_EL1.F16F32: bit 35 > > ID_AA64SMFR0_EL1.I8I32: bits 39-36 > > ID_AA64SMFR0_EL1.F64F64: bit 48 > > ID_AA64SMFR0_EL1.I16I64: bits 55-52 > > ID_AA64SMFR0_EL1.FA64:bit 63 > > This register not listed as exposed. This register and ID_AA64PFR1_EL1.SME are exposed since v5.19. > > > ID_AA64ISAR2_EL1.WFxT:bits 3-0 > > ID_AA64ISAR2_EL1.RPRES: bits 7-4 > > ID_AA64ISAR2_EL1.GPA3:bits 11-8 > > ID_AA64ISAR2_EL1.APA3:bits 15-12 > > GPA3 and APA3 not listed as exposed. These two are exposed since v5.18. > > thanks > -- PMM
Re: [PATCH 4/4] be less conservative with the 64bit pci io window
On Mon, Nov 21, 2022 at 11:32:13AM +0100, Gerd Hoffmann wrote: > Current seabios code will only enable and use the 64bit pci io window in > case it runs out of space in the 32bit pci mmio window below 4G. > > This patch will also enable the 64bit pci io window when > (a) RAM above 4G is present, and > (b) the physical address space size is known, and > (c) seabios is running on a 64bit capable processor. > > This operates with the assumption that guests which are ok with memory > above 4G most likely can handle mmio above 4G too. Thanks. In general, the series looks good to me. Can you elaborate on the background to this change though? It sounds like there is a (small) risk of a regression, so I think it would be good to have a high level understanding of what is driving this memory reorg. Cheers, -Kevin > > In case the 64bit pci io window is enabled also assign more memory to > prefetchable pci bridge windows (scale with address space). > > Signed-off-by: Gerd Hoffmann > --- > src/fw/pciinit.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c > index ad6def93633b..3e9636b139a4 100644 > --- a/src/fw/pciinit.c > +++ b/src/fw/pciinit.c > @@ -51,6 +51,7 @@ u64 pcimem_end = BUILD_PCIMEM_END; > u64 pcimem64_start = BUILD_PCIMEM64_START; > u64 pcimem64_end = BUILD_PCIMEM64_END; > u64 pci_io_low_end = 0xa000; > +u32 pci_use_64bit = 0; > > struct pci_region_entry { > struct pci_device *dev; > @@ -920,6 +921,8 @@ static int pci_bios_check_devices(struct pci_bus *busses) > for (type = 0; type < PCI_REGION_TYPE_COUNT; type++) { > u64 align = (type == PCI_REGION_TYPE_IO) ? > PCI_BRIDGE_IO_MIN : PCI_BRIDGE_MEM_MIN; > +if (pci_use_64bit && (type == PCI_REGION_TYPE_PREFMEM)) > +align = (u64)1 << (PhysBits - 11); > if (!pci_bridge_has_region(s->bus_dev, type)) > continue; > u64 size = 0; > @@ -1108,7 +,7 @@ static void pci_bios_map_devices(struct pci_bus *busses) > panic("PCI: out of I/O address space\n"); > > dprintf(1, "PCI: 32: %016llx - %016llx\n", pcimem_start, pcimem_end); > -if (pci_bios_init_root_regions_mem(busses)) { > +if (pci_use_64bit || pci_bios_init_root_regions_mem(busses)) { > struct pci_region r64_mem, r64_pref; > r64_mem.list.first = NULL; > r64_pref.list.first = NULL; > @@ -1174,6 +1177,9 @@ pci_setup(void) > > dprintf(3, "pci setup\n"); > > +if (PhysBits >= 36 && LongMode && RamSizeOver4G) > +pci_use_64bit = 1; > + > dprintf(1, "=== PCI bus & bridge init ===\n"); > if (pci_probe_host() != 0) { > return; > -- > 2.38.1 > >
Re: [PATCH for-8.0 15/29] include/qemu/int128: Add vector type to Int128Alias
On 22/11/22 19:21, Philippe Mathieu-Daudé wrote: On 18/11/22 10:47, Richard Henderson wrote: Adding a vector type will make it easier to handle i386 have_atomic16 via AVX. Signed-off-by: Richard Henderson --- include/qemu/int128.h | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/include/qemu/int128.h b/include/qemu/int128.h index f62a46b48c..f29f90e6f4 100644 --- a/include/qemu/int128.h +++ b/include/qemu/int128.h @@ -479,16 +479,16 @@ static inline void bswap128s(Int128 *s) /* * When compiler supports a 128-bit type, define a combination of * a possible structure and the native types. Ease parameter passing - * via use of the transparent union extension. + * via use of the transparent union extension. Provide a vector type + * for use in atomicity on some hosts. */ -#ifdef CONFIG_INT128 typedef union { Int128 s; + uint64_t v __attribute__((vector_size(16))); +#ifdef CONFIG_INT128 __int128_t i; __uint128_t u; -} Int128Alias __attribute__((transparent_union)); -#else -typedef Int128 Int128Alias; #endif /* CONFIG_INT128 */ +} Int128Alias __attribute__((transparent_union)); #endif /* INT128_H */ This triggers a warning with GCC: Ah no, looking closer, even configured as ''--cc=gcc-12 --host-cc=gcc-12 --cxx=/bin/false', Clang got selected for ObjC, and this warning comes from it: Compiling Objective-C object libcommon.fa.p/ui_cocoa.m.o In file included from ../../ui/cocoa.m:36: In file included from include/sysemu/sysemu.h:5: In file included from include/qemu/timer.h:4: In file included from include/qemu/bitops.h:16: In file included from include/qemu/host-utils.h:35: include/qemu/int128.h:487:14: warning: alignment of field 'v' (128 bits) does not match the alignment of the first field in transparent union; transparent_union attribute ignored [-Wignored-attributes] uint64_t v __attribute__((vector_size(16))); ^ include/qemu/int128.h:486:12: note: alignment of first field is 64 bits Int128 s; ^ Meson: Project version: 7.1.91 C compiler for the host machine: gcc-12 (gcc 12.2.0 "gcc-12 (Homebrew GCC 12.2.0) 12.2.0") C linker for the host machine: gcc-12 ld64 819.6 Host machine cpu family: aarch64 Host machine cpu: arm64 Objective-C compiler for the host machine: clang (clang 14.0.0) Objective-C linker for the host machine: clang ld64 819.6 Regards, Phil.
Re: [PATCH] target/arm: align exposed ID registers with Linux
On Tue, 22 Nov 2022 at 03:05, Zhuojia Shen wrote: > > In CPUID registers exposed to userspace, some registers were missing > and some fields were not exposed. This patch aligns exposed ID > registers and their fields with what Linux exposes currently. > > Specifically, the following new ID registers/fields are exposed to > userspace: These changes don't quite seem to line up with what the kernel documents that it exposes: https://www.kernel.org/doc/Documentation/arm64/cpu-feature-registers.rst > ID_AA64PFR1_EL1.BT: bits 3-0 > ID_AA64PFR1_EL1.MTE: bits 11-8 > ID_AA64PFR1_EL1.SME: bits 27-24 .SME not listed as exposed. > ID_AA64SMFR0_EL1.F32F32: bit 32 > ID_AA64SMFR0_EL1.B16F32: bit 34 > ID_AA64SMFR0_EL1.F16F32: bit 35 > ID_AA64SMFR0_EL1.I8I32: bits 39-36 > ID_AA64SMFR0_EL1.F64F64: bit 48 > ID_AA64SMFR0_EL1.I16I64: bits 55-52 > ID_AA64SMFR0_EL1.FA64:bit 63 This register not listed as exposed. > ID_AA64ISAR2_EL1.WFxT:bits 3-0 > ID_AA64ISAR2_EL1.RPRES: bits 7-4 > ID_AA64ISAR2_EL1.GPA3:bits 11-8 > ID_AA64ISAR2_EL1.APA3:bits 15-12 GPA3 and APA3 not listed as exposed. thanks -- PMM
Re: [PATCH for-8.0 15/29] include/qemu/int128: Add vector type to Int128Alias
On 18/11/22 10:47, Richard Henderson wrote: Adding a vector type will make it easier to handle i386 have_atomic16 via AVX. Signed-off-by: Richard Henderson --- include/qemu/int128.h | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/include/qemu/int128.h b/include/qemu/int128.h index f62a46b48c..f29f90e6f4 100644 --- a/include/qemu/int128.h +++ b/include/qemu/int128.h @@ -479,16 +479,16 @@ static inline void bswap128s(Int128 *s) /* * When compiler supports a 128-bit type, define a combination of * a possible structure and the native types. Ease parameter passing - * via use of the transparent union extension. + * via use of the transparent union extension. Provide a vector type + * for use in atomicity on some hosts. */ -#ifdef CONFIG_INT128 typedef union { Int128 s; +uint64_t v __attribute__((vector_size(16))); +#ifdef CONFIG_INT128 __int128_t i; __uint128_t u; -} Int128Alias __attribute__((transparent_union)); -#else -typedef Int128 Int128Alias; #endif /* CONFIG_INT128 */ +} Int128Alias __attribute__((transparent_union)); #endif /* INT128_H */ This triggers a warning with GCC: include/qemu/int128.h:487:14: warning: alignment of field 'v' (128 bits) does not match the alignment of the first field in transparent union; transparent_union attribute ignored [-Wignored-attributes] uint64_t v __attribute__((vector_size(16))); ^ include/qemu/int128.h:486:12: note: alignment of first field is 64 bits Int128 s; ^ Meson: Project version: 7.1.91 C compiler for the host machine: gcc-12 (gcc 12.2.0 "gcc-12 (Homebrew GCC 12.2.0) 12.2.0") C linker for the host machine: gcc-12 ld64 819.6 Host machine cpu family: aarch64 Host machine cpu: arm64
Re: [PATCH for-8.0 v3 14/45] tcg: Introduce tcg_type_size
On 11/22/22 10:14, Philippe Mathieu-Daudé wrote: I'd feel safer if we assign TCG_TYPE_I32 .. TCG_TYPE_V256 in TCGType, just in case. What do you mean? -- >8 -- diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h @@ -289,8 +289,8 @@ typedef struct TCGPool { typedef enum TCGType { - TCG_TYPE_I32, - TCG_TYPE_I64, + TCG_TYPE_I32 = 0, + TCG_TYPE_I64 = 1, - TCG_TYPE_V64, - TCG_TYPE_V128, - TCG_TYPE_V256, + TCG_TYPE_V64 = 2, + TCG_TYPE_V128 = 3, + TCG_TYPE_V256 = 4, But that's what C does. I don't see the point. r~
Re: [PATCH for-8.0 v3 14/45] tcg: Introduce tcg_type_size
On 22/11/22 17:54, Richard Henderson wrote: On 11/22/22 03:30, Philippe Mathieu-Daudé wrote: On 11/11/22 08:40, Richard Henderson wrote: Add a helper function for computing the size of a type. Signed-off-by: Richard Henderson --- include/tcg/tcg.h | 16 tcg/tcg.c | 26 -- 2 files changed, 28 insertions(+), 14 deletions(-) diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h index f2da340bb9..8bcd60d0ed 100644 --- a/include/tcg/tcg.h +++ b/include/tcg/tcg.h @@ -319,6 +319,22 @@ typedef enum TCGType { #endif } TCGType; +/** + * tcg_type_size + * @t: type + * + * Return the size of the type in bytes. + */ +static inline int tcg_type_size(TCGType t) +{ + unsigned i = t; + if (i >= TCG_TYPE_V64) { + tcg_debug_assert(i < TCG_TYPE_COUNT); + i -= TCG_TYPE_V64 - 1; + } + return 4 << i; I'd feel safer if we assign TCG_TYPE_I32 .. TCG_TYPE_V256 in TCGType, just in case. What do you mean? -- >8 -- diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h @@ -289,8 +289,8 @@ typedef struct TCGPool { typedef enum TCGType { -TCG_TYPE_I32, -TCG_TYPE_I64, +TCG_TYPE_I32 = 0, +TCG_TYPE_I64 = 1, -TCG_TYPE_V64, -TCG_TYPE_V128, -TCG_TYPE_V256, +TCG_TYPE_V64 = 2, +TCG_TYPE_V128 = 3, +TCG_TYPE_V256 = 4, ---
Re: [PATCH] target/arm: align exposed ID registers with Linux
On 11/21/22 18:48, Zhuojia Shen wrote: In CPUID registers exposed to userspace, some registers were missing and some fields were not exposed. This patch aligns exposed ID registers and their fields with what Linux exposes currently. Specifically, the following new ID registers/fields are exposed to userspace: ID_AA64PFR1_EL1.BT: bits 3-0 ID_AA64PFR1_EL1.MTE: bits 11-8 ID_AA64PFR1_EL1.SME: bits 27-24 ID_AA64ZFR0_EL1.SVEver: bits 3-0 ID_AA64ZFR0_EL1.AES: bits 7-4 ID_AA64ZFR0_EL1.BitPerm: bits 19-16 ID_AA64ZFR0_EL1.BF16: bits 23-20 ID_AA64ZFR0_EL1.SHA3: bits 35-32 ID_AA64ZFR0_EL1.SM4: bits 43-40 ID_AA64ZFR0_EL1.I8MM: bits 47-44 ID_AA64ZFR0_EL1.F32MM:bits 55-52 ID_AA64ZFR0_EL1.F64MM:bits 59-56 ID_AA64SMFR0_EL1.F32F32: bit 32 ID_AA64SMFR0_EL1.B16F32: bit 34 ID_AA64SMFR0_EL1.F16F32: bit 35 ID_AA64SMFR0_EL1.I8I32: bits 39-36 ID_AA64SMFR0_EL1.F64F64: bit 48 ID_AA64SMFR0_EL1.I16I64: bits 55-52 ID_AA64SMFR0_EL1.FA64:bit 63 ID_AA64MMFR0_EL1.ECV: bits 63-60 ID_AA64MMFR1_EL1.AFP: bits 47-44 ID_AA64MMFR2_EL1.AT: bits 35-32 ID_AA64ISAR0_EL1.RNDR:bits 63-60 ID_AA64ISAR1_EL1.FRINTTS: bits 35-32 ID_AA64ISAR1_EL1.BF16:bits 47-44 ID_AA64ISAR1_EL1.DGH: bits 51-48 ID_AA64ISAR1_EL1.I8MM:bits 55-52 ID_AA64ISAR2_EL1.WFxT:bits 3-0 ID_AA64ISAR2_EL1.RPRES: bits 7-4 ID_AA64ISAR2_EL1.GPA3:bits 11-8 ID_AA64ISAR2_EL1.APA3:bits 15-12 Signed-off-by: Zhuojia Shen --- target/arm/helper.c | 19 ++- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/target/arm/helper.c b/target/arm/helper.c index d8c8223ec3..ce6fd7a96d 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -7826,13 +7826,20 @@ void register_cp_regs_for_features(ARMCPU *cpu) .exported_bits = 0x000f000f00ff, .fixed_bits= 0x0011 }, { .name = "ID_AA64PFR1_EL1", - .exported_bits = 0x00f0 }, + .exported_bits = 0x0f000fff }, Existing, but I think it would be nicer to do this symbolically. e.g. .exported_bits = R_ID_AA64PFR1_BT_MASK | R_ID_AA64PFR1_SBSS_MASK | R_ID_AA64PFR1_MTE_MASK | R_ID_AA64PFR1_SME_MASK, etc. r~
[PATCH 2/3] tcg: Factor init_ffi_layouts() out of tcg_context_init()
Signed-off-by: Richard Henderson Message-Id: <2022074101.2069454-27-richard.hender...@linaro.org> [PMD: Split from bigger patch] Signed-off-by: Philippe Mathieu-Daudé --- tcg/tcg.c | 83 +-- 1 file changed, 44 insertions(+), 39 deletions(-) diff --git a/tcg/tcg.c b/tcg/tcg.c index 8aa6fc9a25..9b24b4d863 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -572,7 +572,49 @@ static ffi_type *typecode_to_ffi(int argmask) } g_assert_not_reached(); } -#endif + +static void init_ffi_layouts(void) +{ +/* g_direct_hash/equal for direct comparisons on uint32_t. */ +ffi_table = g_hash_table_new(NULL, NULL); +for (int i = 0; i < ARRAY_SIZE(all_helpers); ++i) { +uint32_t typemask = all_helpers[i].typemask; +gpointer hash = (gpointer)(uintptr_t)typemask; +struct { +ffi_cif cif; +ffi_type *args[]; +} *ca; +ffi_status status; +int nargs; + +if (g_hash_table_lookup(ffi_table, hash)) { +continue; +} + +/* Ignoring the return type, find the last non-zero field. */ +nargs = 32 - clz32(typemask >> 3); +nargs = DIV_ROUND_UP(nargs, 3); + +ca = g_malloc0(sizeof(*ca) + nargs * sizeof(ffi_type *)); +ca->cif.rtype = typecode_to_ffi(typemask & 7); +ca->cif.nargs = nargs; + +if (nargs != 0) { +ca->cif.arg_types = ca->args; +for (int j = 0; j < nargs; ++j) { +int typecode = extract32(typemask, (j + 1) * 3, 3); +ca->args[j] = typecode_to_ffi(typecode); +} +} + +status = ffi_prep_cif(>cif, FFI_DEFAULT_ABI, nargs, + ca->cif.rtype, ca->cif.arg_types); +assert(status == FFI_OK); + +g_hash_table_insert(ffi_table, hash, (gpointer)>cif); +} +} +#endif /* CONFIG_TCG_INTERPRETER */ typedef struct TCGCumulativeArgs { int arg_idx;/* tcg_gen_callN args[] */ @@ -753,44 +795,7 @@ static void tcg_context_init(unsigned max_cpus) } #ifdef CONFIG_TCG_INTERPRETER -/* g_direct_hash/equal for direct comparisons on uint32_t. */ -ffi_table = g_hash_table_new(NULL, NULL); -for (i = 0; i < ARRAY_SIZE(all_helpers); ++i) { -struct { -ffi_cif cif; -ffi_type *args[]; -} *ca; -uint32_t typemask = all_helpers[i].typemask; -gpointer hash = (gpointer)(uintptr_t)typemask; -ffi_status status; -int nargs; - -if (g_hash_table_lookup(ffi_table, hash)) { -continue; -} - -/* Ignoring the return type, find the last non-zero field. */ -nargs = 32 - clz32(typemask >> 3); -nargs = DIV_ROUND_UP(nargs, 3); - -ca = g_malloc0(sizeof(*ca) + nargs * sizeof(ffi_type *)); -ca->cif.rtype = typecode_to_ffi(typemask & 7); -ca->cif.nargs = nargs; - -if (nargs != 0) { -ca->cif.arg_types = ca->args; -for (int j = 0; j < nargs; ++j) { -int typecode = extract32(typemask, (j + 1) * 3, 3); -ca->args[j] = typecode_to_ffi(typecode); -} -} - -status = ffi_prep_cif(>cif, FFI_DEFAULT_ABI, nargs, - ca->cif.rtype, ca->cif.arg_types); -assert(status == FFI_OK); - -g_hash_table_insert(ffi_table, hash, (gpointer)>cif); -} +init_ffi_layouts(); #endif tcg_target_init(s); -- 2.38.1
[PATCH 1/3] tcg: Convert typecode_to_ffi from array to function
In the unlikely case of invalid typecode mask, the function will abort instead of returning a NULL pointer. Signed-off-by: Richard Henderson Message-Id: <2022074101.2069454-27-richard.hender...@linaro.org> [PMD: Split from bigger patch] Signed-off-by: Philippe Mathieu-Daudé --- tcg/tcg.c | 30 -- 1 file changed, 20 insertions(+), 10 deletions(-) diff --git a/tcg/tcg.c b/tcg/tcg.c index cabc397a38..8aa6fc9a25 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -554,14 +554,24 @@ static GHashTable *helper_table; #ifdef CONFIG_TCG_INTERPRETER static GHashTable *ffi_table; -static ffi_type * const typecode_to_ffi[8] = { -[dh_typecode_void] = _type_void, -[dh_typecode_i32] = _type_uint32, -[dh_typecode_s32] = _type_sint32, -[dh_typecode_i64] = _type_uint64, -[dh_typecode_s64] = _type_sint64, -[dh_typecode_ptr] = _type_pointer, -}; +static ffi_type *typecode_to_ffi(int argmask) +{ +switch (argmask) { +case dh_typecode_void: +return _type_void; +case dh_typecode_i32: +return _type_uint32; +case dh_typecode_s32: +return _type_sint32; +case dh_typecode_i64: +return _type_uint64; +case dh_typecode_s64: +return _type_sint64; +case dh_typecode_ptr: +return _type_pointer; +} +g_assert_not_reached(); +} #endif typedef struct TCGCumulativeArgs { @@ -764,14 +774,14 @@ static void tcg_context_init(unsigned max_cpus) nargs = DIV_ROUND_UP(nargs, 3); ca = g_malloc0(sizeof(*ca) + nargs * sizeof(ffi_type *)); -ca->cif.rtype = typecode_to_ffi[typemask & 7]; +ca->cif.rtype = typecode_to_ffi(typemask & 7); ca->cif.nargs = nargs; if (nargs != 0) { ca->cif.arg_types = ca->args; for (int j = 0; j < nargs; ++j) { int typecode = extract32(typemask, (j + 1) * 3, 3); -ca->args[j] = typecode_to_ffi[typecode]; +ca->args[j] = typecode_to_ffi(typecode); } } -- 2.38.1
[PATCH 3/3] tcg: Move ffi_cif pointer into TCGHelperInfo
From: Richard Henderson Instead of requiring a separate hash table lookup, put a pointer to the CIF into TCGHelperInfo. Signed-off-by: Richard Henderson Message-Id: <2022074101.2069454-27-richard.hender...@linaro.org> [PMD: Split from bigger patch] Signed-off-by: Philippe Mathieu-Daudé --- tcg/tcg-internal.h | 7 +++ tcg/tcg.c | 26 ++ 2 files changed, 21 insertions(+), 12 deletions(-) diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h index c7e87e193d..6e50aeba3a 100644 --- a/tcg/tcg-internal.h +++ b/tcg/tcg-internal.h @@ -25,6 +25,10 @@ #ifndef TCG_INTERNAL_H #define TCG_INTERNAL_H +#ifdef CONFIG_TCG_INTERPRETER +#include +#endif + #define TCG_HIGHWATER 1024 /* @@ -57,6 +61,9 @@ typedef struct TCGCallArgumentLoc { typedef struct TCGHelperInfo { void *func; const char *name; +#ifdef CONFIG_TCG_INTERPRETER +ffi_cif *cif; +#endif unsigned typemask : 32; unsigned flags : 8; unsigned nr_in : 8; diff --git a/tcg/tcg.c b/tcg/tcg.c index 9b24b4d863..d6a3036412 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -552,8 +552,6 @@ static TCGHelperInfo all_helpers[] = { static GHashTable *helper_table; #ifdef CONFIG_TCG_INTERPRETER -static GHashTable *ffi_table; - static ffi_type *typecode_to_ffi(int argmask) { switch (argmask) { @@ -576,9 +574,11 @@ static ffi_type *typecode_to_ffi(int argmask) static void init_ffi_layouts(void) { /* g_direct_hash/equal for direct comparisons on uint32_t. */ -ffi_table = g_hash_table_new(NULL, NULL); +GHashTable *ffi_table = g_hash_table_new(NULL, NULL); + for (int i = 0; i < ARRAY_SIZE(all_helpers); ++i) { -uint32_t typemask = all_helpers[i].typemask; +TCGHelperInfo *info = _helpers[i]; +unsigned typemask = info->typemask; gpointer hash = (gpointer)(uintptr_t)typemask; struct { ffi_cif cif; @@ -586,8 +586,11 @@ static void init_ffi_layouts(void) } *ca; ffi_status status; int nargs; +ffi_cif *cif; -if (g_hash_table_lookup(ffi_table, hash)) { +cif = g_hash_table_lookup(ffi_table, hash); +if (cif) { +info->cif = cif; continue; } @@ -611,8 +614,12 @@ static void init_ffi_layouts(void) ca->cif.rtype, ca->cif.arg_types); assert(status == FFI_OK); -g_hash_table_insert(ffi_table, hash, (gpointer)>cif); +cif = >cif; +info->cif = cif; +g_hash_table_insert(ffi_table, hash, (gpointer)cif); } + +g_hash_table_destroy(ffi_table); } #endif /* CONFIG_TCG_INTERPRETER */ @@ -4413,12 +4420,7 @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op) } #ifdef CONFIG_TCG_INTERPRETER -{ -gpointer hash = (gpointer)(uintptr_t)info->typemask; -ffi_cif *cif = g_hash_table_lookup(ffi_table, hash); -assert(cif != NULL); -tcg_out_call(s, tcg_call_func(op), cif); -} +tcg_out_call(s, tcg_call_func(op), info->cif); #else tcg_out_call(s, tcg_call_func(op)); #endif -- 2.38.1
[PATCH 0/3] tcg: Move ffi_cif pointer into TCGHelperInfo (splitted)
Hi Richard, Your "Move ffi_cif pointer into TCGHelperInfo" patch was not obvious (to me) at first, so I split it in 3 trivial patches. Philippe Mathieu-Daudé (2): tcg: Convert typecode_to_ffi from array to function tcg: Factor init_ffi_layouts() out of tcg_context_init() Richard Henderson (1): tcg: Move ffi_cif pointer into TCGHelperInfo tcg/tcg-internal.h | 7 +++ tcg/tcg.c | 125 + 2 files changed, 78 insertions(+), 54 deletions(-) -- 2.38.1
Re: [PATCH for-8.0 07/29] accel/tcg: Honor atomicity of loads
On 11/22/22 06:35, Peter Maydell wrote: +/* + * Here we have the architectural atomicity of the operation. + * However, when executing in a serial context, we need no extra + * host atomicity in order to avoid racing. This reduction + * avoids looping with cpu_loop_exit_atomic. + */ +if (cpu_in_serial_context(env_cpu(env))) { Is it OK to use cpu_in_serial_context() here ? Even if there's no other vCPU executing in parallel, there might be device model code doing a memory write in the iothread, I think. Well, it's no different from how we currently treat compare-and-swap expansion. But you have a point -- we should probably be doing something with the iothread lock for both EXCP_ATOMIC and round-robin mode. r~
Re: [PATCH] vhost: mask VIRTIO_F_RING_RESET for vhost and vhost-user devices
On Tue, Nov 22, 2022 at 4:13 AM Jason Wang wrote: > > On Mon, Nov 21, 2022 at 6:11 PM Stefano Garzarella > wrote: > > > > Commit 69e1c14aa2 ("virtio: core: vq reset feature negotation support") > > enabled VIRTIO_F_RING_RESET by default for all virtio devices. > > > > This feature is not currently emulated by QEMU, so for vhost and > > vhost-user devices we need to make sure it is supported by the offloaded > > device emulation (in-kernel or in another process). > > To do this we need to add VIRTIO_F_RING_RESET to the features bitmap > > passed to vhost_get_features(). This way it will be masked if the device > > does not support it. > > > > This issue was initially discovered with vhost-vsock and vhost-user-vsock, > > and then also tested with vhost-user-rng which confirmed the same issue. > > They fail when sending features through VHOST_SET_FEATURES ioctl or > > VHOST_USER_SET_FEATURES message, since VIRTIO_F_RING_RESET is negotiated > > by the guest (Linux >= v6.0), but not supported by the device. > > > > Fixes: 69e1c14aa2 ("virtio: core: vq reset feature negotation support") > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1318 > > Signed-off-by: Stefano Garzarella > > Acked-by: Jason Wang > > > --- > > > > To prevent this problem in the future, perhaps we should provide a function > > (e.g. vhost_device_get_features) where we go to mask all non-device-specific > > features (e.g VIRTIO_F_*, VIRTIO_RING_F_*) that are not emulated by QEMU but > > we expect them to be emulated by the vhost or vhost-user devices. > > Adding Eugenio, this might not be true if we want to use shadow > virtqueue for emulating some features? > I think we should be able to introduce that in the (hypothetical) vhost_device_get_features if SVQ starts emulating a ring feature, isn't it? I think a first version not aware of SVQ is fine anyway, and to introduce it later should be easy. Thanks! > Thanks > > > Then we can call it in all .get_features callbacks just before return the > > features. > > > > What do you think? > > > > But maybe better to do that for the next release, I will send an RFC. > > > > Thanks, > > Stefano > > --- > > hw/block/vhost-user-blk.c | 1 + > > hw/net/vhost_net.c | 1 + > > hw/scsi/vhost-scsi.c | 1 + > > hw/scsi/vhost-user-scsi.c | 1 + > > hw/virtio/vhost-user-fs.c | 1 + > > hw/virtio/vhost-user-gpio.c| 1 + > > hw/virtio/vhost-user-i2c.c | 1 + > > hw/virtio/vhost-user-rng.c | 11 +-- > > hw/virtio/vhost-vsock-common.c | 1 + > > net/vhost-vdpa.c | 1 + > > 10 files changed, 18 insertions(+), 2 deletions(-) > > > > diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c > > index 16ad400889..0d5190accf 100644 > > --- a/hw/block/vhost-user-blk.c > > +++ b/hw/block/vhost-user-blk.c > > @@ -52,6 +52,7 @@ static const int user_feature_bits[] = { > > VIRTIO_F_NOTIFY_ON_EMPTY, > > VIRTIO_F_RING_PACKED, > > VIRTIO_F_IOMMU_PLATFORM, > > +VIRTIO_F_RING_RESET, > > VHOST_INVALID_FEATURE_BIT > > }; > > > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > > index feda448878..26e4930676 100644 > > --- a/hw/net/vhost_net.c > > +++ b/hw/net/vhost_net.c > > @@ -75,6 +75,7 @@ static const int user_feature_bits[] = { > > VIRTIO_NET_F_MTU, > > VIRTIO_F_IOMMU_PLATFORM, > > VIRTIO_F_RING_PACKED, > > +VIRTIO_F_RING_RESET, > > VIRTIO_NET_F_RSS, > > VIRTIO_NET_F_HASH_REPORT, > > > > diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c > > index bdf337a7a2..6a0fd0dfb1 100644 > > --- a/hw/scsi/vhost-scsi.c > > +++ b/hw/scsi/vhost-scsi.c > > @@ -38,6 +38,7 @@ static const int kernel_feature_bits[] = { > > VIRTIO_RING_F_INDIRECT_DESC, > > VIRTIO_RING_F_EVENT_IDX, > > VIRTIO_SCSI_F_HOTPLUG, > > +VIRTIO_F_RING_RESET, > > VHOST_INVALID_FEATURE_BIT > > }; > > > > diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c > > index bc37317d55..b7a71a802c 100644 > > --- a/hw/scsi/vhost-user-scsi.c > > +++ b/hw/scsi/vhost-user-scsi.c > > @@ -36,6 +36,7 @@ static const int user_feature_bits[] = { > > VIRTIO_RING_F_INDIRECT_DESC, > > VIRTIO_RING_F_EVENT_IDX, > > VIRTIO_SCSI_F_HOTPLUG, > > +VIRTIO_F_RING_RESET, > > VHOST_INVALID_FEATURE_BIT > > }; > > > > diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c > > index 1c40f42045..dc4014cdef 100644 > > --- a/hw/virtio/vhost-user-fs.c > > +++ b/hw/virtio/vhost-user-fs.c > > @@ -32,6 +32,7 @@ static const int user_feature_bits[] = { > > VIRTIO_F_NOTIFY_ON_EMPTY, > > VIRTIO_F_RING_PACKED, > > VIRTIO_F_IOMMU_PLATFORM, > > +VIRTIO_F_RING_RESET, > > > > VHOST_INVALID_FEATURE_BIT > > }; > > diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c > > index 677d1c7730..5851cb3bc9 100644 > > --- a/hw/virtio/vhost-user-gpio.c > > +++ b/hw/virtio/vhost-user-gpio.c > > @@ -24,6 +24,7 @@ static const int feature_bits[] =
Re: [RFC 2/7] migration: No save_live_pending() method uses the QEMUFile parameter
* Juan Quintela (quint...@redhat.com) wrote: > So remove it everywhere. > > Signed-off-by: Juan Quintela Reviewed-by: Dr. David Alan Gilbert > --- > include/migration/register.h | 6 ++ > migration/savevm.h | 2 +- > hw/s390x/s390-stattrib.c | 2 +- > hw/vfio/migration.c| 6 ++ > migration/block-dirty-bitmap.c | 5 ++--- > migration/block.c | 2 +- > migration/migration.c | 3 +-- > migration/ram.c| 2 +- > migration/savevm.c | 5 ++--- > 9 files changed, 13 insertions(+), 20 deletions(-) > > diff --git a/include/migration/register.h b/include/migration/register.h > index 1950fee6a8..5b5424ed8f 100644 > --- a/include/migration/register.h > +++ b/include/migration/register.h > @@ -46,10 +46,8 @@ typedef struct SaveVMHandlers { > > /* This runs outside the iothread lock! */ > int (*save_setup)(QEMUFile *f, void *opaque); > -void (*save_live_pending)(QEMUFile *f, void *opaque, > - uint64_t threshold_size, > - uint64_t *rest_precopy, > - uint64_t *rest_postcopy); > +void (*save_live_pending)(void *opaque, uint64_t threshold_size, > + uint64_t *rest_precopy, uint64_t > *rest_postcopy); > /* Note for save_live_pending: > * - res_precopy is for data which must be migrated in precopy > * phase or in stopped state, in other words - before target > diff --git a/migration/savevm.h b/migration/savevm.h > index 9bd55c336c..98fae6f9b3 100644 > --- a/migration/savevm.h > +++ b/migration/savevm.h > @@ -40,7 +40,7 @@ void qemu_savevm_state_cleanup(void); > void qemu_savevm_state_complete_postcopy(QEMUFile *f); > int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only, > bool inactivate_disks); > -void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size, > +void qemu_savevm_state_pending(uint64_t max_size, > uint64_t *res_precopy, uint64_t > *res_postcopy); > void qemu_savevm_send_ping(QEMUFile *f, uint32_t value); > void qemu_savevm_send_open_return_path(QEMUFile *f); > diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c > index ee60b53da4..9b74eeadf3 100644 > --- a/hw/s390x/s390-stattrib.c > +++ b/hw/s390x/s390-stattrib.c > @@ -182,7 +182,7 @@ static int cmma_save_setup(QEMUFile *f, void *opaque) > return 0; > } > > -static void cmma_save_pending(QEMUFile *f, void *opaque, uint64_t max_size, > +static void cmma_save_pending(void *opaque, uint64_t max_size, >uint64_t *res_precopy, uint64_t *res_postcopy) > { > S390StAttribState *sas = S390_STATTRIB(opaque); > diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c > index 3423f113f0..760d5f3c5c 100644 > --- a/hw/vfio/migration.c > +++ b/hw/vfio/migration.c > @@ -456,10 +456,8 @@ static void vfio_save_cleanup(void *opaque) > trace_vfio_save_cleanup(vbasedev->name); > } > > -static void vfio_save_pending(QEMUFile *f, void *opaque, > - uint64_t threshold_size, > - uint64_t *res_precopy, > - uint64_t *res_postcopy) > +static void vfio_save_pending(void *opaque, uint64_t threshold_size, > + uint64_t *res_precopy, uint64_t *res_postcopy) > { > VFIODevice *vbasedev = opaque; > VFIOMigration *migration = vbasedev->migration; > diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c > index dfea546330..a445bdc3c3 100644 > --- a/migration/block-dirty-bitmap.c > +++ b/migration/block-dirty-bitmap.c > @@ -761,9 +761,8 @@ static int dirty_bitmap_save_complete(QEMUFile *f, void > *opaque) > return 0; > } > > -static void dirty_bitmap_save_pending(QEMUFile *f, void *opaque, > - uint64_t max_size, > - uint64_t *res_precopy, > +static void dirty_bitmap_save_pending(void *opaque, uint64_t max_size, > + uint64_t *res_precopy, >uint64_t *res_postcopy) > { > DBMSaveState *s = &((DBMState *)opaque)->save; > diff --git a/migration/block.c b/migration/block.c > index 4ae8f837b0..b3d680af75 100644 > --- a/migration/block.c > +++ b/migration/block.c > @@ -862,7 +862,7 @@ static int block_save_complete(QEMUFile *f, void *opaque) > return 0; > } > > -static void block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size, > +static void block_save_pending(void *opaque, uint64_t max_size, > uint64_t *res_precopy, > uint64_t *res_postcopy) > { > diff --git a/migration/migration.c b/migration/migration.c > index 440aa62f16..038fc58a96 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@
Re: [PATCH v3 1/2] io: Add support for MSG_PEEK for socket channel
On Tue, Nov 22, 2022 at 12:16:08PM -0500, Peter Xu wrote: > On Tue, Nov 22, 2022 at 10:12:25PM +0530, manish.mishra wrote: > > > > On 22/11/22 10:03 pm, Peter Xu wrote: > > > On Tue, Nov 22, 2022 at 11:29:05AM -0500, Peter Xu wrote: > > > > On Tue, Nov 22, 2022 at 11:10:18AM -0500, Peter Xu wrote: > > > > > On Tue, Nov 22, 2022 at 09:01:59PM +0530, manish.mishra wrote: > > > > > > On 22/11/22 8:19 pm, Daniel P. Berrangé wrote: > > > > > > > On Tue, Nov 22, 2022 at 09:41:01AM -0500, Peter Xu wrote: > > > > > > > > On Tue, Nov 22, 2022 at 02:38:53PM +0530, manish.mishra wrote: > > > > > > > > > On 22/11/22 2:30 pm, Daniel P. Berrangé wrote: > > > > > > > > > > On Sat, Nov 19, 2022 at 09:36:14AM +, manish.mishra > > > > > > > > > > wrote: > > > > > > > > > > > MSG_PEEK reads from the peek of channel, The data is > > > > > > > > > > > treated as > > > > > > > > > > > unread and the next read shall still return this data. > > > > > > > > > > > This > > > > > > > > > > > support is currently added only for socket class. Extra > > > > > > > > > > > parameter > > > > > > > > > > > 'flags' is added to io_readv calls to pass extra read > > > > > > > > > > > flags like > > > > > > > > > > > MSG_PEEK. > > > > > > > > > > > > > > > > > > > > > > Suggested-by: Daniel P. Berrangé > > > > > > > > > > Signed-off-by: manish.mishra > > > > > > > > > > > --- > > > > > > > > > > > chardev/char-socket.c | 4 +- > > > > > > > > > > > include/io/channel.h| 83 > > > > > > > > > > > + > > > > > > > > > > > io/channel-buffer.c | 1 + > > > > > > > > > > > io/channel-command.c| 1 + > > > > > > > > > > > io/channel-file.c | 1 + > > > > > > > > > > > io/channel-null.c | 1 + > > > > > > > > > > > io/channel-socket.c | 16 +- > > > > > > > > > > > io/channel-tls.c| 1 + > > > > > > > > > > > io/channel-websock.c| 1 + > > > > > > > > > > > io/channel.c| 73 > > > > > > > > > > > +++-- > > > > > > > > > > > migration/channel-block.c | 1 + > > > > > > > > > > > scsi/qemu-pr-helper.c | 2 +- > > > > > > > > > > > tests/qtest/tpm-emu.c | 2 +- > > > > > > > > > > > tests/unit/test-io-channel-socket.c | 1 + > > > > > > > > > > > util/vhost-user-server.c| 2 +- > > > > > > > > > > > 15 files changed, 179 insertions(+), 11 deletions(-) > > > > > > > > > > > diff --git a/io/channel-socket.c b/io/channel-socket.c > > > > > > > > > > > index b76dca9cc1..a06b24766d 100644 > > > > > > > > > > > --- a/io/channel-socket.c > > > > > > > > > > > +++ b/io/channel-socket.c > > > > > > > > > > > @@ -406,6 +406,8 @@ > > > > > > > > > > > qio_channel_socket_accept(QIOChannelSocket *ioc, > > > > > > > > > > > } > > > > > > > > > > > #endif /* WIN32 */ > > > > > > > > > > > +qio_channel_set_feature(QIO_CHANNEL(cioc), > > > > > > > > > > > QIO_CHANNEL_FEATURE_READ_MSG_PEEK); > > > > > > > > > > > + > > > > > > > > > > This covers the incoming server side socket. > > > > > > > > > > > > > > > > > > > > This also needs to be set in outgoing client side socket in > > > > > > > > > > qio_channel_socket_connect_async > > > > > > > > > Yes sorry, i considered only current use-case, but as it is > > > > > > > > > generic one both should be there. Thanks will update it. > > > > > > > > > > > > > > > > > > > > @@ -705,7 +718,6 @@ static ssize_t > > > > > > > > > > > qio_channel_socket_writev(QIOChannel *ioc, > > > > > > > > > > > } > > > > > > > > > > > #endif /* WIN32 */ > > > > > > > > > > > - > > > > > > > > > > > #ifdef QEMU_MSG_ZEROCOPY > > > > > > > > > > > static int qio_channel_socket_flush(QIOChannel *ioc, > > > > > > > > > > > Error **errp) > > > > > > > > > > Please remove this unrelated whitespace change. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > @@ -109,6 +117,37 @@ int > > > > > > > > > > > qio_channel_readv_all_eof(QIOChannel *ioc, > > > > > > > > > > > return qio_channel_readv_full_all_eof(ioc, iov, > > > > > > > > > > > niov, NULL, NULL, errp); > > > > > > > > > > > } > > > > > > > > > > > +int qio_channel_readv_peek_all_eof(QIOChannel *ioc, > > > > > > > > > > > + const struct iovec > > > > > > > > > > > *iov, > > > > > > > > > > > + size_t niov, > > > > > > > > > > > + Error **errp) > > > > > > > > > > > +{ > > > > > > > > > > > + ssize_t len = 0; > > > > > > > > > > > + ssize_t total = iov_size(iov, niov); > > > > > > > > > > > + > > > > > > > > > > > + while (len < total) { > > > > > > > > > > > + len = qio_channel_readv_full(ioc, iov, niov, NULL, > > > > > > > > > > > +
Re: [PATCH for-7.2 1/5] hw/nvme: fix aio cancel in format
On Tue, Nov 22, 2022 at 09:13:44AM +0100, Klaus Jensen wrote: > There are several bugs in the async cancel code for the Format command. > > Firstly, cancelling a format operation neglects to set iocb->ret as well > as clearing the iocb->aiocb after cancelling the underlying aiocb which > causes the aio callback to ignore the cancellation. Trivial fix. > > Secondly, and worse, because the request is queued up for posting to the > CQ in a bottom half, if the cancellation is due to the submission queue > being deleted (which calls blk_aio_cancel), the req structure is > deallocated in nvme_del_sq prior to the bottom half being schedulued. > > Fix this by simply removing the bottom half, there is no reason to defer > it anyway. I thought for sure I'd find a reason defered execution was needed, but hey, it looks perfectly fine with this change! > diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c > index ac3885ce5079..26b53469328f 100644 > --- a/hw/nvme/ctrl.c > +++ b/hw/nvme/ctrl.c > @@ -5756,14 +5756,15 @@ typedef struct NvmeFormatAIOCB { > uint8_t pil; > } NvmeFormatAIOCB; I think you can remove the QEMUBH member from the above struct with this patch. Otherwise looks good. Reviewed-by: Keith Busch
Re: [PATCH v3 1/2] io: Add support for MSG_PEEK for socket channel
On Tue, Nov 22, 2022 at 10:12:25PM +0530, manish.mishra wrote: > > On 22/11/22 10:03 pm, Peter Xu wrote: > > On Tue, Nov 22, 2022 at 11:29:05AM -0500, Peter Xu wrote: > > > On Tue, Nov 22, 2022 at 11:10:18AM -0500, Peter Xu wrote: > > > > On Tue, Nov 22, 2022 at 09:01:59PM +0530, manish.mishra wrote: > > > > > On 22/11/22 8:19 pm, Daniel P. Berrangé wrote: > > > > > > On Tue, Nov 22, 2022 at 09:41:01AM -0500, Peter Xu wrote: > > > > > > > On Tue, Nov 22, 2022 at 02:38:53PM +0530, manish.mishra wrote: > > > > > > > > On 22/11/22 2:30 pm, Daniel P. Berrangé wrote: > > > > > > > > > On Sat, Nov 19, 2022 at 09:36:14AM +, manish.mishra wrote: > > > > > > > > > > MSG_PEEK reads from the peek of channel, The data is > > > > > > > > > > treated as > > > > > > > > > > unread and the next read shall still return this data. This > > > > > > > > > > support is currently added only for socket class. Extra > > > > > > > > > > parameter > > > > > > > > > > 'flags' is added to io_readv calls to pass extra read flags > > > > > > > > > > like > > > > > > > > > > MSG_PEEK. > > > > > > > > > > > > > > > > > > > > Suggested-by: Daniel P. Berrangé > > > > > > > > > Signed-off-by: manish.mishra > > > > > > > > > > --- > > > > > > > > > > chardev/char-socket.c | 4 +- > > > > > > > > > > include/io/channel.h| 83 > > > > > > > > > > + > > > > > > > > > > io/channel-buffer.c | 1 + > > > > > > > > > > io/channel-command.c| 1 + > > > > > > > > > > io/channel-file.c | 1 + > > > > > > > > > > io/channel-null.c | 1 + > > > > > > > > > > io/channel-socket.c | 16 +- > > > > > > > > > > io/channel-tls.c| 1 + > > > > > > > > > > io/channel-websock.c| 1 + > > > > > > > > > > io/channel.c| 73 > > > > > > > > > > +++-- > > > > > > > > > > migration/channel-block.c | 1 + > > > > > > > > > > scsi/qemu-pr-helper.c | 2 +- > > > > > > > > > > tests/qtest/tpm-emu.c | 2 +- > > > > > > > > > > tests/unit/test-io-channel-socket.c | 1 + > > > > > > > > > > util/vhost-user-server.c| 2 +- > > > > > > > > > > 15 files changed, 179 insertions(+), 11 deletions(-) > > > > > > > > > > diff --git a/io/channel-socket.c b/io/channel-socket.c > > > > > > > > > > index b76dca9cc1..a06b24766d 100644 > > > > > > > > > > --- a/io/channel-socket.c > > > > > > > > > > +++ b/io/channel-socket.c > > > > > > > > > > @@ -406,6 +406,8 @@ > > > > > > > > > > qio_channel_socket_accept(QIOChannelSocket *ioc, > > > > > > > > > > } > > > > > > > > > > #endif /* WIN32 */ > > > > > > > > > > +qio_channel_set_feature(QIO_CHANNEL(cioc), > > > > > > > > > > QIO_CHANNEL_FEATURE_READ_MSG_PEEK); > > > > > > > > > > + > > > > > > > > > This covers the incoming server side socket. > > > > > > > > > > > > > > > > > > This also needs to be set in outgoing client side socket in > > > > > > > > > qio_channel_socket_connect_async > > > > > > > > Yes sorry, i considered only current use-case, but as it is > > > > > > > > generic one both should be there. Thanks will update it. > > > > > > > > > > > > > > > > > > @@ -705,7 +718,6 @@ static ssize_t > > > > > > > > > > qio_channel_socket_writev(QIOChannel *ioc, > > > > > > > > > > } > > > > > > > > > > #endif /* WIN32 */ > > > > > > > > > > - > > > > > > > > > > #ifdef QEMU_MSG_ZEROCOPY > > > > > > > > > > static int qio_channel_socket_flush(QIOChannel *ioc, > > > > > > > > > > Error **errp) > > > > > > > > > Please remove this unrelated whitespace change. > > > > > > > > > > > > > > > > > > > > > > > > > > > > @@ -109,6 +117,37 @@ int > > > > > > > > > > qio_channel_readv_all_eof(QIOChannel *ioc, > > > > > > > > > > return qio_channel_readv_full_all_eof(ioc, iov, > > > > > > > > > > niov, NULL, NULL, errp); > > > > > > > > > > } > > > > > > > > > > +int qio_channel_readv_peek_all_eof(QIOChannel *ioc, > > > > > > > > > > + const struct iovec *iov, > > > > > > > > > > + size_t niov, > > > > > > > > > > + Error **errp) > > > > > > > > > > +{ > > > > > > > > > > + ssize_t len = 0; > > > > > > > > > > + ssize_t total = iov_size(iov, niov); > > > > > > > > > > + > > > > > > > > > > + while (len < total) { > > > > > > > > > > + len = qio_channel_readv_full(ioc, iov, niov, NULL, > > > > > > > > > > +NULL, > > > > > > > > > > QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp); > > > > > > > > > > + > > > > > > > > > > + if (len == QIO_CHANNEL_ERR_BLOCK) { > > > > > > > > > > +if (qemu_in_coroutine()) { > > > > > > > > > > +
Re: [PATCH 1/2] hw/xen/xen_pt: Call default handler only if no custom one is set
On Mon, Nov 14, 2022 at 08:20:10PM +0100, Marek Marczykowski-Górecki wrote: > diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c > index 0ec7e52183..269bd26109 100644 > --- a/hw/xen/xen_pt.c > +++ b/hw/xen/xen_pt.c > @@ -255,6 +255,7 @@ static void xen_pt_pci_write_config(PCIDevice *d, > uint32_t addr, > uint32_t find_addr = addr; > XenPTRegInfo *reg = NULL; > bool wp_flag = false; > +uint32_t emul_mask = 0, write_val; > > if (xen_pt_pci_config_access_check(d, addr, len)) { > return; > @@ -310,7 +311,6 @@ static void xen_pt_pci_write_config(PCIDevice *d, > uint32_t addr, > } > > memory_region_transaction_begin(); > -pci_default_write_config(d, addr, val, len); > > /* adjust the read and write value to appropriate CFC-CFF window */ > read_val <<= (addr & 3) << 3; > @@ -370,6 +370,8 @@ static void xen_pt_pci_write_config(PCIDevice *d, > uint32_t addr, > return; > } > > +emul_mask |= ( (1 << (reg->size * 8) ) - 1 ) << ((find_addr & 3) > * 8); > + > /* calculate next address to find */ > emul_len -= reg->size; > if (emul_len > 0) { > @@ -396,6 +398,24 @@ static void xen_pt_pci_write_config(PCIDevice *d, > uint32_t addr, > /* need to shift back before passing them to xen_host_pci_set_block. */ > val >>= (addr & 3) << 3; > > +/* store emulated registers that didn't have specific hooks */ > +write_val = val; > +for (index = 0; emul_mask; index += emul_len) { `index` isn't used, was it meant to be use for something? > +emul_len = 0; > +while (emul_mask & 0xff) { > +emul_len++; This seems to count the number of byte that have a hook (xen_pt_find_reg() found a `reg_entry`). This loop should count instead the number of bytes for which no `reg_entry` have been found, right? Shouldn't the loop count when a byte in emul_mask is unset? > +emul_mask >>= 8; > +} > +if (emul_len) { > +uint32_t mask = ((1 << (emul_len * 8)) - 1); > +pci_default_write_config(d, addr, write_val & mask, emul_len); `addr` isn't updated in the loop, aren't we going to write bytes to the wrong place? If for example "emul_mask == 0x00ff00ff" ? > +write_val >>= emul_len * 8; > +} else { > +emul_mask >>= 8; > +write_val >>= 8; > +} > +} Thanks, -- Anthony PERARD
Re: [PULL for 7.2-rc2 0/3] loongarch for 7.2-rc2 patches
Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any user-visible changes. signature.asc Description: PGP signature
Re: [PULL 0/8] pc,virtio: regression, test fixes
Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any user-visible changes. signature.asc Description: PGP signature
Re: [PULL v2 for 7.2-rc2 00/11] testing and doc updates
Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any user-visible changes. signature.asc Description: PGP signature
Re: Plugin Memory Callback Debugging
On Nov 21 22:02, Alex Bennée wrote: > > Aaron Lindsay writes: > > > Sorry, left off the very end of my timeline: > > > > On Nov 18 16:58, Aaron Lindsay wrote: > >> I have, so far, discovered the following timeline: > >> 1. My plugin receives a instruction execution callback for a load > >>instruction. At this time, cpu->plugin_mem_cbs points to the same > >>memory which will later be freed > >> 2. During the handling of this callback, my plugin calls > >qemu_plugin_reset() > > The final plugin reset should only execute in the safe async context > (i.e. no other vCPUs running code). That flushes all current generated > code. > > >> 3. Ostensibly something goes wrong here with the cleanup of > >>cpu->plugin_mem_cbs??? > > This may be missed by the reset path (hence your patch) but it should be > being reset every instruction we instrument. > > >> 4. Step 2 triggers the TBs to be flushed, which frees the memory pointed > >>to by cpu->plugin_mem_cbs > > > > 5. A store exclusive instruction is translated and then executed, which > >requires the use of a helper. When executed, this helper checks > >cpu->plugin_mem_cbs, which is non-null, so it attempts to dereference > >and use it, resulting in the assertion. > > It should be being reset for each instruction I think. FYI - I suspect my above presentation of the problem suffered from the "searching where the streetlamp is instead of where you lost something" problem. In other words, I did/do observe the error at reset, but I now believe that is merely where it is easiest to observe. The cpu->plugin_mem_cbs doesn't appear to be reset at the end of instructions and manifests at reset because that's when the underlying memory is freed. -Aaron
Re: [PATCH] multifd: Updated QAPI format for 'migrate' qemu monitor command
On Tue, Nov 22, 2022 at 09:27:52PM +0530, manish.mishra wrote: > > On 22/11/22 2:53 pm, Daniel P. Berrangé wrote: > > For our future sanity I think we need to define a brand new migration > > protocol which is bidirectional from the start. A large number of the > > MigrateParameters would become obsolete immediately, as QEMU could > > negotiate them automatically. This would let QEMU introduce new > > migration features without requiring any work in libvirt in many > > cases. Libvirt should only be required when there are performance > > tunables that need exposing, never for protocol feature negotiation. > > > > I think introducing a new QMP command, without introducing a fully > > new protocol would be a big mistake as the QMP command is not the > > problem we have. > > > > Daniel, Totally agree on above mentioned discussion. Does this > account for verifying other things too along with migration > capabilities e.g. if qemu machine type, vm config or cpu > features are excatly same of both side. Currently libvirt > takes that responsibility, but as you mentioned libvirt may > take some time to get to level where qemu is so causing > potential issues till then. Similar to this discussion we > had on libvirt list > https://www.mail-archive.com/libvir-list@redhat.com/msg233725.html. > If these things too were managed by qemu indepenedelty > things could have been much better. I mean those too are > kind of related to live migration. :) Today libvirt has no choice, because if there's an ABI compat mistake in the dest QEMU config, vs the src QEMU config, then when loading VMstate on the target, the dst QEMU will often be unable to de-serialize the migration device stream and end up printing an error on stderr and exiting. Getting that info back to the src QEMU is impossible. If QEMU had a bi-directional migration stream, then the dst QEMU could simply send an error message back to the src QEMU and 'query-migrate' could actually display this. There may still be validation libvirt wants to do as well, but at least there would be the possiblity of offloading to QEMU without sacrificing error reporting. With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH for-8.0 v3 14/45] tcg: Introduce tcg_type_size
On 11/22/22 03:30, Philippe Mathieu-Daudé wrote: On 11/11/22 08:40, Richard Henderson wrote: Add a helper function for computing the size of a type. Signed-off-by: Richard Henderson --- include/tcg/tcg.h | 16 tcg/tcg.c | 26 -- 2 files changed, 28 insertions(+), 14 deletions(-) diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h index f2da340bb9..8bcd60d0ed 100644 --- a/include/tcg/tcg.h +++ b/include/tcg/tcg.h @@ -319,6 +319,22 @@ typedef enum TCGType { #endif } TCGType; +/** + * tcg_type_size + * @t: type + * + * Return the size of the type in bytes. + */ +static inline int tcg_type_size(TCGType t) +{ + unsigned i = t; + if (i >= TCG_TYPE_V64) { + tcg_debug_assert(i < TCG_TYPE_COUNT); + i -= TCG_TYPE_V64 - 1; + } + return 4 << i; I'd feel safer if we assign TCG_TYPE_I32 .. TCG_TYPE_V256 in TCGType, just in case. What do you mean? r~
Re: [PATCH] vhost-vdpa: skip TPM CRB memory section
On Tue, Nov 22, 2022 at 3:53 PM wrote: > > From: Marc-André Lureau > > 851d6d1a0f ("vfio/common: remove spurious tpm-crb-cmd misalignment > warning") removed the warning on vfio_listener_region_add() path. > > An error is reported for vhost-vdpa case: > qemu-kvm: vhost_vdpa_listener_region_add received unaligned region > > Skip the CRB device. > > Fixes: > https://bugzilla.redhat.com/show_bug.cgi?id=2141965 > > Signed-off-by: Marc-André Lureau Acked-by: Eugenio Pérez And CCing Jason too. > --- > hw/virtio/vhost-vdpa.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > index 7468e44b87..9d7206e4b8 100644 > --- a/hw/virtio/vhost-vdpa.c > +++ b/hw/virtio/vhost-vdpa.c > @@ -19,6 +19,7 @@ > #include "hw/virtio/virtio-net.h" > #include "hw/virtio/vhost-shadow-virtqueue.h" > #include "hw/virtio/vhost-vdpa.h" > +#include "sysemu/tpm.h" > #include "exec/address-spaces.h" > #include "migration/blocker.h" > #include "qemu/cutils.h" > @@ -46,6 +47,11 @@ static bool > vhost_vdpa_listener_skipped_section(MemoryRegionSection *section, > { > Int128 llend; > > +if (TPM_IS_CRB(section->mr->owner)) { > +/* The CRB command buffer has its base address unaligned. */ > +return true; > +} > + > if ((!memory_region_is_ram(section->mr) && > !memory_region_is_iommu(section->mr)) || > memory_region_is_protected(section->mr) || > -- > 2.38.1 >
Re: [PATCH v3 1/2] io: Add support for MSG_PEEK for socket channel
On 22/11/22 10:03 pm, Peter Xu wrote: On Tue, Nov 22, 2022 at 11:29:05AM -0500, Peter Xu wrote: On Tue, Nov 22, 2022 at 11:10:18AM -0500, Peter Xu wrote: On Tue, Nov 22, 2022 at 09:01:59PM +0530, manish.mishra wrote: On 22/11/22 8:19 pm, Daniel P. Berrangé wrote: On Tue, Nov 22, 2022 at 09:41:01AM -0500, Peter Xu wrote: On Tue, Nov 22, 2022 at 02:38:53PM +0530, manish.mishra wrote: On 22/11/22 2:30 pm, Daniel P. Berrangé wrote: On Sat, Nov 19, 2022 at 09:36:14AM +, manish.mishra wrote: MSG_PEEK reads from the peek of channel, The data is treated as unread and the next read shall still return this data. This support is currently added only for socket class. Extra parameter 'flags' is added to io_readv calls to pass extra read flags like MSG_PEEK. Suggested-by: Daniel P. Berrangé --- chardev/char-socket.c | 4 +- include/io/channel.h| 83 + io/channel-buffer.c | 1 + io/channel-command.c| 1 + io/channel-file.c | 1 + io/channel-null.c | 1 + io/channel-socket.c | 16 +- io/channel-tls.c| 1 + io/channel-websock.c| 1 + io/channel.c| 73 +++-- migration/channel-block.c | 1 + scsi/qemu-pr-helper.c | 2 +- tests/qtest/tpm-emu.c | 2 +- tests/unit/test-io-channel-socket.c | 1 + util/vhost-user-server.c| 2 +- 15 files changed, 179 insertions(+), 11 deletions(-) diff --git a/io/channel-socket.c b/io/channel-socket.c index b76dca9cc1..a06b24766d 100644 --- a/io/channel-socket.c +++ b/io/channel-socket.c @@ -406,6 +406,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc, } #endif /* WIN32 */ +qio_channel_set_feature(QIO_CHANNEL(cioc), QIO_CHANNEL_FEATURE_READ_MSG_PEEK); + This covers the incoming server side socket. This also needs to be set in outgoing client side socket in qio_channel_socket_connect_async Yes sorry, i considered only current use-case, but as it is generic one both should be there. Thanks will update it. @@ -705,7 +718,6 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, } #endif /* WIN32 */ - #ifdef QEMU_MSG_ZEROCOPY static int qio_channel_socket_flush(QIOChannel *ioc, Error **errp) Please remove this unrelated whitespace change. @@ -109,6 +117,37 @@ int qio_channel_readv_all_eof(QIOChannel *ioc, return qio_channel_readv_full_all_eof(ioc, iov, niov, NULL, NULL, errp); } +int qio_channel_readv_peek_all_eof(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + Error **errp) +{ + ssize_t len = 0; + ssize_t total = iov_size(iov, niov); + + while (len < total) { + len = qio_channel_readv_full(ioc, iov, niov, NULL, +NULL, QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp); + + if (len == QIO_CHANNEL_ERR_BLOCK) { +if (qemu_in_coroutine()) { +qio_channel_yield(ioc, G_IO_IN); +} else { +qio_channel_wait(ioc, G_IO_IN); +} +continue; + } + if (len == 0) { + return 0; + } + if (len < 0) { + return -1; + } + } This will busy wait burning CPU where there is a read > 0 and < total. Daniel, i could use MSG_WAITALL too if that works but then we will lose opportunity to yield. Or if you have some other idea. How easy would this happen? Another alternative is we could just return the partial len to caller then we fallback to the original channel orders if it happens. And then if it mostly will never happen it'll behave merely the same as what we want. Well we're trying to deal with a bug where the slow and/or unreliable network causes channels to arrive in unexpected order. Given we know we're having network trouble, I wouldn't want to make more assumptions about things happening correctly. With regards, Daniel Peter, I have seen MSG_PEEK used in combination with MSG_WAITALL, but looks like even though chances are less it can still return partial data even with multiple retries for signal case, so is not full proof. *MSG_WAITALL *(since Linux 2.2) This flag requests that the operation block until the full request is satisfied. However, the call may still return less data than requested if a signal is caught, an error or disconnect occurs, or the next data to be received is of a different type than that returned. This flag has no effect for datagram sockets. Actual read ahead will be little hackish, so just confirming we all are in agreement to do actual read ahead and i can
[PULL 2/2] target/arm: Use signed quantity to represent VMSAv8-64 translation level
From: Ard Biesheuvel The LPA2 extension implements 52-bit virtual addressing for 4k and 16k translation granules, and for the former, this means an additional level of translation is needed. This means we start counting at -1 instead of 0 when doing a walk, and so 'level' is now a signed quantity, and should be typed as such. So turn it from uint32_t into int32_t. This avoids a level of -1 getting misinterpreted as being >= 3, and terminating a page table walk prematurely with a bogus output address. Cc: Peter Maydell Cc: Philippe Mathieu-Daudé Cc: Richard Henderson Signed-off-by: Ard Biesheuvel Reviewed-by: Peter Maydell Signed-off-by: Peter Maydell --- target/arm/ptw.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/target/arm/ptw.c b/target/arm/ptw.c index 8ca468d65bc..f812734bfb2 100644 --- a/target/arm/ptw.c +++ b/target/arm/ptw.c @@ -1172,7 +1172,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw, ARMCPU *cpu = env_archcpu(env); ARMMMUIdx mmu_idx = ptw->in_mmu_idx; bool is_secure = ptw->in_secure; -uint32_t level; +int32_t level; ARMVAParameters param; uint64_t ttbr; hwaddr descaddr, indexmask, indexmask_grainsize; @@ -1302,7 +1302,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw, */ uint32_t sl0 = extract32(tcr, 6, 2); uint32_t sl2 = extract64(tcr, 33, 1); -uint32_t startlevel; +int32_t startlevel; bool ok; /* SL2 is RES0 unless DS=1 & 4kb granule. */ -- 2.25.1
[PULL 0/2] target-arm queue
Hi; this pull request has a couple of fixes for bugs in the Arm page-table-walk code, which arrived in the last day or so. I'm sending this out now in the hope it might just sneak in before rc2 gets tagged, so the fixes can get more testing time before the 7.2 release; but if they don't make it then this should go into rc3. thanks -- PMM The following changes since commit 6d71357a3b651ec9db126e4862b77e13165427f5: rtl8139: honor large send MSS value (2022-11-21 09:28:43 -0500) are available in the Git repository at: https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20221122 for you to fetch changes up to 15f8f4671afd22491ce99d28a296514717fead4f: target/arm: Use signed quantity to represent VMSAv8-64 translation level (2022-11-22 16:10:25 +) target-arm: * Fix broken 5-level pagetable handling * Fix debug accesses when EL2 is present Ard Biesheuvel (1): target/arm: Use signed quantity to represent VMSAv8-64 translation level Peter Maydell (1): target/arm: Don't do two-stage lookup if stage 2 is disabled target/arm/ptw.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-)
[PULL 1/2] target/arm: Don't do two-stage lookup if stage 2 is disabled
In get_phys_addr_with_struct(), we call get_phys_addr_twostage() if the CPU supports EL2. However, we don't check here that stage 2 is actually enabled. Instead we only check that inside get_phys_addr_twostage() to skip stage 2 translation. This means that even if stage 2 is disabled we still tell the stage 1 lookup to do its page table walks via stage 2. This works by luck for normal CPU accesses, but it breaks for debug accesses, which are used by the disassembler and also by semihosting file reads and writes, because the debug case takes a different code path inside S1_ptw_translate(). This means that setups that use semihosting for file loads are broken (a regression since 7.1, introduced in recent ptw refactoring), and that sometimes disassembly in debug logs reports "unable to read memory" rather than showing the guest insns. Fix the bug by hoisting the "is stage 2 enabled?" check up to get_phys_addr_with_struct(), so that we handle S2 disabled the same way we do the "no EL2" case, with a simple single stage lookup. Reported-by: Jens Wiklander Reviewed-by: Richard Henderson Signed-off-by: Peter Maydell Message-id: 20221121212404.1450382-1-peter.mayd...@linaro.org --- target/arm/ptw.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/target/arm/ptw.c b/target/arm/ptw.c index 9a6277d862f..8ca468d65bc 100644 --- a/target/arm/ptw.c +++ b/target/arm/ptw.c @@ -2612,8 +2612,8 @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw, ret = get_phys_addr_with_struct(env, ptw, address, access_type, result, fi); -/* If S1 fails or S2 is disabled, return early. */ -if (ret || regime_translation_disabled(env, ARMMMUIdx_Stage2, is_secure)) { +/* If S1 fails, return early. */ +if (ret) { return ret; } @@ -2739,7 +2739,8 @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw, * Otherwise, a stage1+stage2 translation is just stage 1. */ ptw->in_mmu_idx = mmu_idx = s1_mmu_idx; -if (arm_feature(env, ARM_FEATURE_EL2)) { +if (arm_feature(env, ARM_FEATURE_EL2) && +!regime_translation_disabled(env, ARMMMUIdx_Stage2, is_secure)) { return get_phys_addr_twostage(env, ptw, address, access_type, result, fi); } -- 2.25.1
Re: [PATCH v3 1/2] io: Add support for MSG_PEEK for socket channel
On Tue, Nov 22, 2022 at 11:29:05AM -0500, Peter Xu wrote: > On Tue, Nov 22, 2022 at 11:10:18AM -0500, Peter Xu wrote: > > On Tue, Nov 22, 2022 at 09:01:59PM +0530, manish.mishra wrote: > > > > > > On 22/11/22 8:19 pm, Daniel P. Berrangé wrote: > > > > On Tue, Nov 22, 2022 at 09:41:01AM -0500, Peter Xu wrote: > > > > > On Tue, Nov 22, 2022 at 02:38:53PM +0530, manish.mishra wrote: > > > > > > On 22/11/22 2:30 pm, Daniel P. Berrangé wrote: > > > > > > > On Sat, Nov 19, 2022 at 09:36:14AM +, manish.mishra wrote: > > > > > > > > MSG_PEEK reads from the peek of channel, The data is treated as > > > > > > > > unread and the next read shall still return this data. This > > > > > > > > support is currently added only for socket class. Extra > > > > > > > > parameter > > > > > > > > 'flags' is added to io_readv calls to pass extra read flags like > > > > > > > > MSG_PEEK. > > > > > > > > > > > > > > > > Suggested-by: Daniel P. Berrangé > > > > > > > Signed-off-by: manish.mishra > > > > > > > > --- > > > > > > > >chardev/char-socket.c | 4 +- > > > > > > > >include/io/channel.h| 83 > > > > > > > > + > > > > > > > >io/channel-buffer.c | 1 + > > > > > > > >io/channel-command.c| 1 + > > > > > > > >io/channel-file.c | 1 + > > > > > > > >io/channel-null.c | 1 + > > > > > > > >io/channel-socket.c | 16 +- > > > > > > > >io/channel-tls.c| 1 + > > > > > > > >io/channel-websock.c| 1 + > > > > > > > >io/channel.c| 73 > > > > > > > > +++-- > > > > > > > >migration/channel-block.c | 1 + > > > > > > > >scsi/qemu-pr-helper.c | 2 +- > > > > > > > >tests/qtest/tpm-emu.c | 2 +- > > > > > > > >tests/unit/test-io-channel-socket.c | 1 + > > > > > > > >util/vhost-user-server.c| 2 +- > > > > > > > >15 files changed, 179 insertions(+), 11 deletions(-) > > > > > > > > > > > > > > > diff --git a/io/channel-socket.c b/io/channel-socket.c > > > > > > > > index b76dca9cc1..a06b24766d 100644 > > > > > > > > --- a/io/channel-socket.c > > > > > > > > +++ b/io/channel-socket.c > > > > > > > > @@ -406,6 +406,8 @@ qio_channel_socket_accept(QIOChannelSocket > > > > > > > > *ioc, > > > > > > > >} > > > > > > > >#endif /* WIN32 */ > > > > > > > > +qio_channel_set_feature(QIO_CHANNEL(cioc), > > > > > > > > QIO_CHANNEL_FEATURE_READ_MSG_PEEK); > > > > > > > > + > > > > > > > This covers the incoming server side socket. > > > > > > > > > > > > > > This also needs to be set in outgoing client side socket in > > > > > > > qio_channel_socket_connect_async > > > > > > > > > > > > Yes sorry, i considered only current use-case, but as it is generic > > > > > > one both should be there. Thanks will update it. > > > > > > > > > > > > > > @@ -705,7 +718,6 @@ static ssize_t > > > > > > > > qio_channel_socket_writev(QIOChannel *ioc, > > > > > > > >} > > > > > > > >#endif /* WIN32 */ > > > > > > > > - > > > > > > > >#ifdef QEMU_MSG_ZEROCOPY > > > > > > > >static int qio_channel_socket_flush(QIOChannel *ioc, > > > > > > > >Error **errp) > > > > > > > Please remove this unrelated whitespace change. > > > > > > > > > > > > > > > > > > > > > > @@ -109,6 +117,37 @@ int qio_channel_readv_all_eof(QIOChannel > > > > > > > > *ioc, > > > > > > > >return qio_channel_readv_full_all_eof(ioc, iov, niov, > > > > > > > > NULL, NULL, errp); > > > > > > > >} > > > > > > > > +int qio_channel_readv_peek_all_eof(QIOChannel *ioc, > > > > > > > > + const struct iovec *iov, > > > > > > > > + size_t niov, > > > > > > > > + Error **errp) > > > > > > > > +{ > > > > > > > > + ssize_t len = 0; > > > > > > > > + ssize_t total = iov_size(iov, niov); > > > > > > > > + > > > > > > > > + while (len < total) { > > > > > > > > + len = qio_channel_readv_full(ioc, iov, niov, NULL, > > > > > > > > +NULL, > > > > > > > > QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp); > > > > > > > > + > > > > > > > > + if (len == QIO_CHANNEL_ERR_BLOCK) { > > > > > > > > +if (qemu_in_coroutine()) { > > > > > > > > +qio_channel_yield(ioc, G_IO_IN); > > > > > > > > +} else { > > > > > > > > +qio_channel_wait(ioc, G_IO_IN); > > > > > > > > +} > > > > > > > > +continue; > > > > > > > > + } > > > > > > > > + if (len == 0) { > > > > > > > > + return 0; > > > > > > > > + } > > > > > > > > + if (len < 0) { > > > > > > > > + return -1; > > > > > > > > + } > > > > > > > > + } > > > > > > > This will busy wait
Re: [PATCH v3 1/2] io: Add support for MSG_PEEK for socket channel
On Tue, Nov 22, 2022 at 11:10:18AM -0500, Peter Xu wrote: > On Tue, Nov 22, 2022 at 09:01:59PM +0530, manish.mishra wrote: > > > > On 22/11/22 8:19 pm, Daniel P. Berrangé wrote: > > > On Tue, Nov 22, 2022 at 09:41:01AM -0500, Peter Xu wrote: > > > > On Tue, Nov 22, 2022 at 02:38:53PM +0530, manish.mishra wrote: > > > > > On 22/11/22 2:30 pm, Daniel P. Berrangé wrote: > > > > > > On Sat, Nov 19, 2022 at 09:36:14AM +, manish.mishra wrote: > > > > > > > MSG_PEEK reads from the peek of channel, The data is treated as > > > > > > > unread and the next read shall still return this data. This > > > > > > > support is currently added only for socket class. Extra parameter > > > > > > > 'flags' is added to io_readv calls to pass extra read flags like > > > > > > > MSG_PEEK. > > > > > > > > > > > > > > Suggested-by: Daniel P. Berrangé > > > > > > Signed-off-by: manish.mishra > > > > > > > --- > > > > > > >chardev/char-socket.c | 4 +- > > > > > > >include/io/channel.h| 83 > > > > > > > + > > > > > > >io/channel-buffer.c | 1 + > > > > > > >io/channel-command.c| 1 + > > > > > > >io/channel-file.c | 1 + > > > > > > >io/channel-null.c | 1 + > > > > > > >io/channel-socket.c | 16 +- > > > > > > >io/channel-tls.c| 1 + > > > > > > >io/channel-websock.c| 1 + > > > > > > >io/channel.c| 73 > > > > > > > +++-- > > > > > > >migration/channel-block.c | 1 + > > > > > > >scsi/qemu-pr-helper.c | 2 +- > > > > > > >tests/qtest/tpm-emu.c | 2 +- > > > > > > >tests/unit/test-io-channel-socket.c | 1 + > > > > > > >util/vhost-user-server.c| 2 +- > > > > > > >15 files changed, 179 insertions(+), 11 deletions(-) > > > > > > > > > > > > > diff --git a/io/channel-socket.c b/io/channel-socket.c > > > > > > > index b76dca9cc1..a06b24766d 100644 > > > > > > > --- a/io/channel-socket.c > > > > > > > +++ b/io/channel-socket.c > > > > > > > @@ -406,6 +406,8 @@ qio_channel_socket_accept(QIOChannelSocket > > > > > > > *ioc, > > > > > > >} > > > > > > >#endif /* WIN32 */ > > > > > > > +qio_channel_set_feature(QIO_CHANNEL(cioc), > > > > > > > QIO_CHANNEL_FEATURE_READ_MSG_PEEK); > > > > > > > + > > > > > > This covers the incoming server side socket. > > > > > > > > > > > > This also needs to be set in outgoing client side socket in > > > > > > qio_channel_socket_connect_async > > > > > > > > > > Yes sorry, i considered only current use-case, but as it is generic > > > > > one both should be there. Thanks will update it. > > > > > > > > > > > > @@ -705,7 +718,6 @@ static ssize_t > > > > > > > qio_channel_socket_writev(QIOChannel *ioc, > > > > > > >} > > > > > > >#endif /* WIN32 */ > > > > > > > - > > > > > > >#ifdef QEMU_MSG_ZEROCOPY > > > > > > >static int qio_channel_socket_flush(QIOChannel *ioc, > > > > > > >Error **errp) > > > > > > Please remove this unrelated whitespace change. > > > > > > > > > > > > > > > > > > > @@ -109,6 +117,37 @@ int qio_channel_readv_all_eof(QIOChannel > > > > > > > *ioc, > > > > > > >return qio_channel_readv_full_all_eof(ioc, iov, niov, > > > > > > > NULL, NULL, errp); > > > > > > >} > > > > > > > +int qio_channel_readv_peek_all_eof(QIOChannel *ioc, > > > > > > > + const struct iovec *iov, > > > > > > > + size_t niov, > > > > > > > + Error **errp) > > > > > > > +{ > > > > > > > + ssize_t len = 0; > > > > > > > + ssize_t total = iov_size(iov, niov); > > > > > > > + > > > > > > > + while (len < total) { > > > > > > > + len = qio_channel_readv_full(ioc, iov, niov, NULL, > > > > > > > +NULL, > > > > > > > QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp); > > > > > > > + > > > > > > > + if (len == QIO_CHANNEL_ERR_BLOCK) { > > > > > > > +if (qemu_in_coroutine()) { > > > > > > > +qio_channel_yield(ioc, G_IO_IN); > > > > > > > +} else { > > > > > > > +qio_channel_wait(ioc, G_IO_IN); > > > > > > > +} > > > > > > > +continue; > > > > > > > + } > > > > > > > + if (len == 0) { > > > > > > > + return 0; > > > > > > > + } > > > > > > > + if (len < 0) { > > > > > > > + return -1; > > > > > > > + } > > > > > > > + } > > > > > > This will busy wait burning CPU where there is a read > 0 and < > > > > > > total. > > > > > > > > > > > Daniel, i could use MSG_WAITALL too if that works but then we will > > > > > lose opportunity to yield. Or if you have some other idea. > > > > How easy would this happen? > > > > > > > >
Re: [PATCH for 7.2?] vhost: fix vq dirt bitmap syncing when vIOMMU is enabled
Hi, On 11/22/22 10:43, Michael S. Tsirkin wrote: > On Tue, Nov 22, 2022 at 11:01:11AM +0800, Jason Wang wrote: >> When vIOMMU is enabled, the vq->used_phys is actually the IOVA not >> GPA. So we need to translate it to GPA before the syncing otherwise we >> may hit the following crash since IOVA could be out of the scope of >> the GPA log size. This could be noted when using virtio-IOMMU with >> vhost using 1G memory. > Noted how exactly? What does "using 1G memory" mean? We hit the following assertion: qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: Assertion `end / VHOST_LOG_CHUNK < dev->log_size' failed. On the last time vhost_get_log_size() is called it takes into account 2 regions when computing the log_size: qemu-system-x86_64: vhost_get_log_size region 0 last=0x9 updated log_size=0x3 qemu-system-x86_64: vhost_get_log_size region 1 last=0x3fff updated log_size=0x1000 so in vhost_migration_log() vhost_get_log_size(dev) returns 0x1000 In the test case, memory_region_sync_dirty_bitmap() gets called for mem-machine_mem, vga.vram (several times) and eventually on pc.bios. This latter is reponsible for the assertion: qemu-system-x86_64: vhost_log_sync calls sync_dirty_map on pc.bios for the full range qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region on region 0 qemu-system-x86_64: vhost_dev_sync_region end=0x9 < start=0xfffc qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region on region 1 qemu-system-x86_64: vhost_dev_sync_region end=0x3fff < start=0xfffc qemu-system-x86_64: vhost_sync_dirty_bitmap calls vhost_dev_sync_region on vq 0 <- qemu-system-x86_64: vhost_dev_sync_region pc.bios mfirst=0xfffc mlast=0x rfirst=0xf240 rlast=0xfa45 qemu-system-x86_64: vhost_dev_sync_region pc.bios end=0xfa45 VHOST_LOG_CHUNK=0x4 end/VHOST_LOG_CHUNK=0x3fff dev->log_size=0x1000 qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: Assertion `end / VHOST_LOG_CHUNK < dev->log_size' failed. "using 1G memory": We hit the issue with a guest started with 1GB initial RAM. > >> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") >> Cc: qemu-sta...@nongnu.org >> Reported-by: Yalan Zhang >> Tested-by: Eric Auger >> Tested-by: Lei Yang >> Signed-off-by: Jason Wang >> --- >> hw/virtio/vhost.c | 65 --- >> 1 file changed, 45 insertions(+), 20 deletions(-) >> >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c >> index d1c4c20b8c..26b319f34e 100644 >> --- a/hw/virtio/vhost.c >> +++ b/hw/virtio/vhost.c >> @@ -106,11 +106,30 @@ static void vhost_dev_sync_region(struct vhost_dev >> *dev, >> } >> } >> >> +static bool vhost_dev_has_iommu(struct vhost_dev *dev) >> +{ >> +VirtIODevice *vdev = dev->vdev; >> + >> +/* >> + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support >> + * incremental memory mapping API via IOTLB API. For platform that >> + * does not have IOMMU, there's no need to enable this feature >> + * which may cause unnecessary IOTLB miss/update transactions. >> + */ >> +if (vdev) { >> +return virtio_bus_device_iommu_enabled(vdev) && >> +virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); >> +} else { >> +return false; >> +} >> +} >> + >> static int vhost_sync_dirty_bitmap(struct vhost_dev *dev, >> MemoryRegionSection *section, >> hwaddr first, >> hwaddr last) >> { >> +IOMMUTLBEntry iotlb; > why don't we move this inside the scope where it's used? > >> int i; >> hwaddr start_addr; >> hwaddr end_addr; >> @@ -132,13 +151,37 @@ static int vhost_sync_dirty_bitmap(struct vhost_dev >> *dev, >> } >> for (i = 0; i < dev->nvqs; ++i) { >> struct vhost_virtqueue *vq = dev->vqs + i; >> +hwaddr used_phys = vq->used_phys, used_size = vq->used_size; >> +hwaddr phys, s; > these two, too. > >> >> if (!vq->used_phys && !vq->used_size) { >> continue; >> } >> >> -vhost_dev_sync_region(dev, section, start_addr, end_addr, >> vq->used_phys, >> - range_get_last(vq->used_phys, vq->used_size)); >> +if (vhost_dev_has_iommu(dev)) { >> +while (used_size) { >> +rcu_read_lock(); >> +iotlb = address_space_get_iotlb_entry(dev->vdev->dma_as, >> + used_phys, >> + true, >> MEMTXATTRS_UNSPECIFIED); >> +rcu_read_unlock(); >> + >> +if (iotlb.target_as == NULL) { >> +return -EINVAL; > I am not sure how this can trigger. I don't like == NULL: > !iotlb.target_as is more succint. But a bigger question is how to > handle this. callers
Re: [PATCH v2] target/arm: Use signed quantity to represent VMSAv8-64 translation level
On Tue, 22 Nov 2022 at 15:55, Ard Biesheuvel wrote: > > The LPA2 extension implements 52-bit virtual addressing for 4k and 16k > translation granules, and for the former, this means an additional level > of translation is needed. This means we start counting at -1 instead of > 0 when doing a walk, and so 'level' is now a signed quantity, and should > be typed as such. So turn it from uint32_t into int32_t. > > This avoids a level of -1 getting misinterpreted as being >= 3, and > terminating a page table walk prematurely with a bogus output address. > > Cc: Peter Maydell > Cc: Philippe Mathieu-Daudé > Cc: Richard Henderson > Signed-off-by: Ard Biesheuvel Applied to target-arm.next for 7.2, thanks. -- PMM
Re: [PATCH v3 1/2] io: Add support for MSG_PEEK for socket channel
On Tue, Nov 22, 2022 at 09:01:59PM +0530, manish.mishra wrote: > > On 22/11/22 8:19 pm, Daniel P. Berrangé wrote: > > On Tue, Nov 22, 2022 at 09:41:01AM -0500, Peter Xu wrote: > > > On Tue, Nov 22, 2022 at 02:38:53PM +0530, manish.mishra wrote: > > > > On 22/11/22 2:30 pm, Daniel P. Berrangé wrote: > > > > > On Sat, Nov 19, 2022 at 09:36:14AM +, manish.mishra wrote: > > > > > > MSG_PEEK reads from the peek of channel, The data is treated as > > > > > > unread and the next read shall still return this data. This > > > > > > support is currently added only for socket class. Extra parameter > > > > > > 'flags' is added to io_readv calls to pass extra read flags like > > > > > > MSG_PEEK. > > > > > > > > > > > > Suggested-by: Daniel P. Berrangé > > > > > Signed-off-by: manish.mishra > > > > > > --- > > > > > >chardev/char-socket.c | 4 +- > > > > > >include/io/channel.h| 83 > > > > > > + > > > > > >io/channel-buffer.c | 1 + > > > > > >io/channel-command.c| 1 + > > > > > >io/channel-file.c | 1 + > > > > > >io/channel-null.c | 1 + > > > > > >io/channel-socket.c | 16 +- > > > > > >io/channel-tls.c| 1 + > > > > > >io/channel-websock.c| 1 + > > > > > >io/channel.c| 73 > > > > > > +++-- > > > > > >migration/channel-block.c | 1 + > > > > > >scsi/qemu-pr-helper.c | 2 +- > > > > > >tests/qtest/tpm-emu.c | 2 +- > > > > > >tests/unit/test-io-channel-socket.c | 1 + > > > > > >util/vhost-user-server.c| 2 +- > > > > > >15 files changed, 179 insertions(+), 11 deletions(-) > > > > > > > > > > > diff --git a/io/channel-socket.c b/io/channel-socket.c > > > > > > index b76dca9cc1..a06b24766d 100644 > > > > > > --- a/io/channel-socket.c > > > > > > +++ b/io/channel-socket.c > > > > > > @@ -406,6 +406,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc, > > > > > >} > > > > > >#endif /* WIN32 */ > > > > > > +qio_channel_set_feature(QIO_CHANNEL(cioc), > > > > > > QIO_CHANNEL_FEATURE_READ_MSG_PEEK); > > > > > > + > > > > > This covers the incoming server side socket. > > > > > > > > > > This also needs to be set in outgoing client side socket in > > > > > qio_channel_socket_connect_async > > > > > > > > Yes sorry, i considered only current use-case, but as it is generic one > > > > both should be there. Thanks will update it. > > > > > > > > > > @@ -705,7 +718,6 @@ static ssize_t > > > > > > qio_channel_socket_writev(QIOChannel *ioc, > > > > > >} > > > > > >#endif /* WIN32 */ > > > > > > - > > > > > >#ifdef QEMU_MSG_ZEROCOPY > > > > > >static int qio_channel_socket_flush(QIOChannel *ioc, > > > > > >Error **errp) > > > > > Please remove this unrelated whitespace change. > > > > > > > > > > > > > > > > @@ -109,6 +117,37 @@ int qio_channel_readv_all_eof(QIOChannel *ioc, > > > > > >return qio_channel_readv_full_all_eof(ioc, iov, niov, NULL, > > > > > > NULL, errp); > > > > > >} > > > > > > +int qio_channel_readv_peek_all_eof(QIOChannel *ioc, > > > > > > + const struct iovec *iov, > > > > > > + size_t niov, > > > > > > + Error **errp) > > > > > > +{ > > > > > > + ssize_t len = 0; > > > > > > + ssize_t total = iov_size(iov, niov); > > > > > > + > > > > > > + while (len < total) { > > > > > > + len = qio_channel_readv_full(ioc, iov, niov, NULL, > > > > > > +NULL, > > > > > > QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp); > > > > > > + > > > > > > + if (len == QIO_CHANNEL_ERR_BLOCK) { > > > > > > +if (qemu_in_coroutine()) { > > > > > > +qio_channel_yield(ioc, G_IO_IN); > > > > > > +} else { > > > > > > +qio_channel_wait(ioc, G_IO_IN); > > > > > > +} > > > > > > +continue; > > > > > > + } > > > > > > + if (len == 0) { > > > > > > + return 0; > > > > > > + } > > > > > > + if (len < 0) { > > > > > > + return -1; > > > > > > + } > > > > > > + } > > > > > This will busy wait burning CPU where there is a read > 0 and < total. > > > > > > > > > Daniel, i could use MSG_WAITALL too if that works but then we will lose > > > > opportunity to yield. Or if you have some other idea. > > > How easy would this happen? > > > > > > Another alternative is we could just return the partial len to caller then > > > we fallback to the original channel orders if it happens. And then if it > > > mostly will never happen it'll behave merely the same as what we want. > > Well we're trying to deal with a bug where the slow and/or unreliable
Re: [PATCH] vhost-vdpa: skip TPM CRB memory section
Hi Marc-André, On 11/22/22 15:53, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau > > 851d6d1a0f ("vfio/common: remove spurious tpm-crb-cmd misalignment > warning") removed the warning on vfio_listener_region_add() path. > > An error is reported for vhost-vdpa case: > qemu-kvm: vhost_vdpa_listener_region_add received unaligned region > > Skip the CRB device. > > Fixes: > https://bugzilla.redhat.com/show_bug.cgi?id=2141965 > > Signed-off-by: Marc-André Lureau Reviewed-by: Eric Auger Eric > --- > hw/virtio/vhost-vdpa.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > index 7468e44b87..9d7206e4b8 100644 > --- a/hw/virtio/vhost-vdpa.c > +++ b/hw/virtio/vhost-vdpa.c > @@ -19,6 +19,7 @@ > #include "hw/virtio/virtio-net.h" > #include "hw/virtio/vhost-shadow-virtqueue.h" > #include "hw/virtio/vhost-vdpa.h" > +#include "sysemu/tpm.h" > #include "exec/address-spaces.h" > #include "migration/blocker.h" > #include "qemu/cutils.h" > @@ -46,6 +47,11 @@ static bool > vhost_vdpa_listener_skipped_section(MemoryRegionSection *section, > { > Int128 llend; > > +if (TPM_IS_CRB(section->mr->owner)) { > +/* The CRB command buffer has its base address unaligned. */ > +return true; > +} > + > if ((!memory_region_is_ram(section->mr) && > !memory_region_is_iommu(section->mr)) || > memory_region_is_protected(section->mr) ||
[PATCH] target/riscv: Dump sstatus CSR in riscv_cpu_dump_state()
sstatus register dump is currently missing in riscv_cpu_dump_state(). Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1332 Signed-off-by: Bin Meng --- target/riscv/cpu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index d14e95c9dc..80d76f0181 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -382,6 +382,7 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, int flags) CSR_MHARTID, CSR_MSTATUS, CSR_MSTATUSH, +CSR_SSTATUS, CSR_HSTATUS, CSR_VSSTATUS, CSR_MIP, -- 2.34.1
Re: [PATCH] multifd: Updated QAPI format for 'migrate' qemu monitor command
On 22/11/22 2:53 pm, Daniel P. Berrangé wrote: On Mon, Nov 21, 2022 at 01:40:27PM +0100, Juan Quintela wrote: Het Gala wrote: To prevent double data encoding of uris, instead of passing transport mechanisms, host address and port all together in form of a single string and writing different parsing functions, we intend the user to explicitly mention most of the parameters through the medium of qmp command itself. The current patch is continuation of patchset series https://urldefense.proofpoint.com/v2/url?u=https-3A__www.mail-2Darchive.com_qemu-2Ddevel-40nongnu.org_msg901274.html=DwIBaQ=s883GpUCOChKOHiocYtGcg=c4KON2DiMd-szjwjggQcuUvTsPWblztAL0gVzaHnNmc=htOVZmDRCu27EhvYDu-28snN6GV0a6-ffX34joSwBkELLLVGHzRqD0LicZed3Xtw=xvzrWRBN4S5l3orPqu6di0MRq-gSGWZZU6-e7wpn8w4= and reply to the ongoing discussion for better QAPI design here https://urldefense.proofpoint.com/v2/url?u=https-3A__www.mail-2Darchive.com_qemu-2Ddevel-40nongnu.org_msg903753.html=DwIBaQ=s883GpUCOChKOHiocYtGcg=c4KON2DiMd-szjwjggQcuUvTsPWblztAL0gVzaHnNmc=htOVZmDRCu27EhvYDu-28snN6GV0a6-ffX34joSwBkELLLVGHzRqD0LicZed3Xtw=anLqZbhqa73P9SyPUaGk5q8R0SYR6IUtH_FWFML3lZY= . Suggested-by: Daniel P. Berrange Suggested-by: Aravind Retnakaran Suggested-by: Manish Mishra Signed-off-by: Het Gala +{ 'struct': 'MigrateChannel', + 'data' : { +'channeltype' : 'MigrateChannelType', +'*src-addr' : 'MigrateAddress', +'dest-addr' : 'MigrateAddress', Why do we want *both* addresses? This is part of their goal to have source based routing, so that traffic exits the src host via a particular NIC. I think this patch would be better if we split it into two parts. First introduce "MigrateChannel" struct with *only* the 'dest-addr' field, and only allow one element in the list, making 'uri' optional. Basically the first patch would *only* be about switching the 'migrate' command from using a plain string to a MigrateAddress structure. A second patch would then extend it to allow multiple MigrateChannel elements, to support different destinations for each channel. A third patch would then extend it to add src-addr to attain the source based routing. A fourth patch can then deprecate the existing 'uri' field. +'*multifd-count' : 'int' } } And if we are passing a list, why do we want to pass the real number? Yeah, technically I think this field is redundant, as you can just add many entires to the 'channels' list, using the same addresses. All this field does is allow a more compact JSON arg list. I'm not sure this is neccessary, unless we're expecting huge numbers of 'channels', and even then this isn't likely to be a performance issue. # -> { "execute": "migrate", "arguments": { "uri": "tcp:0:4446" } } # <- { "return": {} } # +# -> { "execute": "migrate", +# "arguments": { +# "channels": [ { 'channeltype': 'control', +# 'dest-addr': {'transport': 'socket', +#'type': 'inet', +#'host': '10.12.34.9', 'port': '1050'}}, +#{ 'channeltype': 'data', +# 'src-addr': {'transport': 'socket', +#'type': 'inet', +#'host': '10.12.34.9', +#'port': '4000', 'ipv4': 'true'}, +# 'dest-addr': { 'transport': 'socket', +# 'type': 'inet', +# 'host': '10.12.34.92', +# 'port': '1234', 'ipv4': 'true'}, +# 'multifd-count': 5 }, +#{ 'channeltype': 'data', +# 'src-addr': {'transport': 'socket', +#'type': 'inet', +#'host': '10.2.3.4', 'port': '1000'}, +# 'dest-addr': {'transport': 'socket', +# 'type': 'inet', +# 'host': '0.0.0.4', 'port': '3000'}, +# 'multifd-count': 3 } ] } } +# <- { "return": {} } +# ## { 'command': 'migrate', - 'data': {'uri': 'str', '*blk': 'bool', '*inc': 'bool', - '*detach': 'bool', '*resume': 'bool' } } + 'data': {'*uri': 'str', '*channels': ['MigrateChannel'], '*blk': 'bool', I think that "uri" bit should be dropped, right? No, it is required for back compatibility with existing clients. It should be marked deprecated, and removed several releases later, at which point 'chanels' can become mandatory instead of optional. + '*inc': 'bool', '*detach': 'bool', '*resume': 'bool' } } ## # @migrate-incoming: I can't see how to make the old one to work on top of this one (i.e. we would have to create strings from lists on QAPI, I think that is just too much). All we need is a piece of code
Re: Plugin Memory Callback Debugging
On Nov 21 18:22, Richard Henderson wrote: > On 11/21/22 13:51, Alex Bennée wrote: > > > > Aaron Lindsay writes: > > > > > On Nov 15 22:36, Alex Bennée wrote: > > > > Aaron Lindsay writes: > > > > > I believe the code *should* always reset `cpu->plugin_mem_cbs` to > > > > > NULL at the > > > > > end of an instruction/TB's execution, so its not exactly clear to me > > > > > how this > > > > > is occurring. However, I suspect it may be relevant that we are > > > > > calling > > > > > `free_dyn_cb_arr()` because my plugin called `qemu_plugin_reset()`. > > > > > > > > Hmm I'm going to have to remind myself about how this bit works. > > > > > > When is it expected that cpu->plugin_mem_cbs is reset to NULL if it is > > > set for an instruction? Is it guaranteed it is reset by the end of the > > > tb? > > > > It should be by the end of the instruction. See > > inject_mem_disable_helper() which inserts TCG code to disable the > > helpers. We also have plugin_gen_disable_mem_helpers() which should > > catch every exit out of a block (exit_tb, goto_tb, goto_ptr). That is > > why qemu_plugin_disable_mem_helpers() is only really concerned about > > when we longjmp out of the loop. > > > > > If I were to put an assertion in cpu_tb_exec() just after the call > > > to tcg_qemu_tb_exec(), should cpu->plugin_mem_cbs always be NULL > > > there? > > > > Yes I think so. > > Indeed. Well, the good news is that if this is an assumption we're relying on, it is now trivial to reproduce the problem! Compile some simple program (doesn't really matter, the issue gets triggered early): $ echo "int main() { return 0; }" > simple.c && gcc simple.c -o simple Make this change to cpu_tb_exec(): > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c > index 356fe348de..50a010327d 100644 > --- a/accel/tcg/cpu-exec.c > +++ b/accel/tcg/cpu-exec.c > @@ -436,6 +436,9 @@ cpu_tb_exec(CPUState *cpu, TranslationBlock *itb, int > *tb_exit) > > qemu_thread_jit_execute(); > ret = tcg_qemu_tb_exec(env, tb_ptr); > +if (cpu->plugin_mem_cbs != NULL) { > +g_assert_not_reached(); > +} > cpu->can_do_io = 1; > /* > * TODO: Delay swapping back to the read-write region of the TB And run: $ ./build/qemu-aarch64 -plugin contrib/plugins/libexeclog.so -d plugin ./simple You should fairly quickly see something like: > [snip] > 0, 0x5502814d04, 0xb482, "" > 0, 0x5502814d08, 0xf9400440, "", load, 0x5502844ed0 > 0, 0x5502814d0c, 0xf1001c1f, "" > ** > ERROR:../accel/tcg/cpu-exec.c:440:cpu_tb_exec: code should not be reached > Bail out! ERROR:../accel/tcg/cpu-exec.c:440:cpu_tb_exec: code should not be > reached When digging through my other failure in `rr` I saw the cpu->plugin_mem_cbs pointer changing from one non-null value to another (which also seems to indicate it is not being cleared between instructions). Does this hint that there are cases where reset cpu->plugin_mem_cbs to NULL is getting optimized away, but not the code to set it in the first place? -Aaron
[PATCH v2] target/arm: Use signed quantity to represent VMSAv8-64 translation level
The LPA2 extension implements 52-bit virtual addressing for 4k and 16k translation granules, and for the former, this means an additional level of translation is needed. This means we start counting at -1 instead of 0 when doing a walk, and so 'level' is now a signed quantity, and should be typed as such. So turn it from uint32_t into int32_t. This avoids a level of -1 getting misinterpreted as being >= 3, and terminating a page table walk prematurely with a bogus output address. Cc: Peter Maydell Cc: Philippe Mathieu-Daudé Cc: Richard Henderson Signed-off-by: Ard Biesheuvel --- target/arm/ptw.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/target/arm/ptw.c b/target/arm/ptw.c index 9a6277d862fac229..1d9bb4448761ddf4 100644 --- a/target/arm/ptw.c +++ b/target/arm/ptw.c @@ -1172,7 +1172,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw, ARMCPU *cpu = env_archcpu(env); ARMMMUIdx mmu_idx = ptw->in_mmu_idx; bool is_secure = ptw->in_secure; -uint32_t level; +int32_t level; ARMVAParameters param; uint64_t ttbr; hwaddr descaddr, indexmask, indexmask_grainsize; @@ -1302,7 +1302,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw, */ uint32_t sl0 = extract32(tcr, 6, 2); uint32_t sl2 = extract64(tcr, 33, 1); -uint32_t startlevel; +int32_t startlevel; bool ok; /* SL2 is RES0 unless DS=1 & 4kb granule. */ -- 2.35.1
Re: [PATCH] target/arm: Use signed quantity to represent VMSAv8-64 translation level
On Tue, 22 Nov 2022 at 14:21, Peter Maydell wrote: > > On Mon, 21 Nov 2022 at 19:02, Ard Biesheuvel wrote: > > > > On Mon, 21 Nov 2022 at 19:51, Peter Maydell > > wrote: > > > > > > On Mon, 21 Nov 2022 at 17:43, Ard Biesheuvel wrote: > > > > > > > > The LPA2 extension implements 52-bit virtual addressing for 4k and 16k > > > > translation granules, and for the former, this means an additional level > > > > of translation is needed. This means we start counting at -1 instead of > > > > 0 when doing a walk, and so 'level' is now a signed quantity, and should > > > > be typed as such. So turn it from uint32_t into int32_t. > > > > > > > > > > Does this cause any visible wrong behaviour, or is it just > > > a cleanup thing ? > > > > > > > No, 5 level paging is completely broken because of this, given that > > the 'level < 3' tests give the wrong result for (uint32_t)-1 > > Right, thanks. This seems like a bug worth fixing for 7.2. > Indeed. And the other patch I sent is needed too if you want to run with LPA2 'target/arm: Limit LPA2 effective output address when TCR.DS == 0' In case it is useful, I have a WIP kernel branch here which can be built with 52-bit virtual addressing for 4k or 16k pages. https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm64-4k-lpa2 > We should make 'uint32_t startlevel' also an int32_t > for consistency, I think, given that it is also sometimes > negative, though in that case it doesn't get used in any > comparisons so it's not going to cause wrong behaviour. > Indeed. I'll send a v2 and fold that in.
Re: [PATCH v2 1/2] cleanup: Tweak and re-run return_directly.cocci
BALATON Zoltan writes: > On Tue, 22 Nov 2022, Markus Armbruster wrote: >> Tweak the semantic patch to drop redundant parenthesis around the >> return expression. >> >> Coccinelle drops a comment in hw/rdma/vmw/pvrdma_cmd.c; restored >> manually. >> >> Coccinelle messes up vmdk_co_create(), not sure why. Change dropped, >> will be done manually in the next commit. >> >> Line breaks in target/avr/cpu.h and hw/rdma/vmw/pvrdma_cmd.c tidied up >> manually. >> >> Whitespace in tools/virtiofsd/fuse_lowlevel.c tidied up manually. >> >> checkpatch.pl complains "return of an errno should typically be -ve" >> two times for hw/9pfs/9p-synth.c. Preexisting, the patch merely makes >> it visible to checkpatch.pl. >> >> Signed-off-by: Markus Armbruster [...] >> diff --git a/hw/ppc/ppc4xx_sdram.c b/hw/ppc/ppc4xx_sdram.c >> index 8d7137faf3..54bf9a2b44 100644 >> --- a/hw/ppc/ppc4xx_sdram.c >> +++ b/hw/ppc/ppc4xx_sdram.c >> @@ -520,13 +520,10 @@ static inline hwaddr sdram_ddr2_base(uint32_t bcr) >> >> static hwaddr sdram_ddr2_size(uint32_t bcr) >> { >> -hwaddr size; >> int sh; >> >> sh = 1024 - ((bcr >> 6) & 0x3ff); >> -size = 8 * MiB * sh; >> - >> -return size; >> +return 8 * MiB * sh; >> } >> >> static uint32_t sdram_ddr2_dcr_read(void *opaque, int dcrn) > > There's also an sdram_ddr_size() that's similar and could be changed to > > return sh == 7 ? -1 : (4 * MiB) << sh; > > just to keep these two functions simliar but Coccinelle probably does not > catch that. Also while you're at it the assigmment of sh could be moved to > the declaration to save even more lines. As this then becomes more of a > handwritten patch, maybe it should be a separate patch cleaning these two > functions before the rest. I think it needs to be separate to keep me off Peter's naughty list ;) > Otherwise for this part (or separate patch as above): > > Reviewed-by: BALATON Zoltan Thanks!
Re: [PATCH v3 1/2] io: Add support for MSG_PEEK for socket channel
On 22/11/22 8:19 pm, Daniel P. Berrangé wrote: On Tue, Nov 22, 2022 at 09:41:01AM -0500, Peter Xu wrote: On Tue, Nov 22, 2022 at 02:38:53PM +0530, manish.mishra wrote: On 22/11/22 2:30 pm, Daniel P. Berrangé wrote: On Sat, Nov 19, 2022 at 09:36:14AM +, manish.mishra wrote: MSG_PEEK reads from the peek of channel, The data is treated as unread and the next read shall still return this data. This support is currently added only for socket class. Extra parameter 'flags' is added to io_readv calls to pass extra read flags like MSG_PEEK. Suggested-by: Daniel P. Berrangé --- chardev/char-socket.c | 4 +- include/io/channel.h| 83 + io/channel-buffer.c | 1 + io/channel-command.c| 1 + io/channel-file.c | 1 + io/channel-null.c | 1 + io/channel-socket.c | 16 +- io/channel-tls.c| 1 + io/channel-websock.c| 1 + io/channel.c| 73 +++-- migration/channel-block.c | 1 + scsi/qemu-pr-helper.c | 2 +- tests/qtest/tpm-emu.c | 2 +- tests/unit/test-io-channel-socket.c | 1 + util/vhost-user-server.c| 2 +- 15 files changed, 179 insertions(+), 11 deletions(-) diff --git a/io/channel-socket.c b/io/channel-socket.c index b76dca9cc1..a06b24766d 100644 --- a/io/channel-socket.c +++ b/io/channel-socket.c @@ -406,6 +406,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc, } #endif /* WIN32 */ +qio_channel_set_feature(QIO_CHANNEL(cioc), QIO_CHANNEL_FEATURE_READ_MSG_PEEK); + This covers the incoming server side socket. This also needs to be set in outgoing client side socket in qio_channel_socket_connect_async Yes sorry, i considered only current use-case, but as it is generic one both should be there. Thanks will update it. @@ -705,7 +718,6 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, } #endif /* WIN32 */ - #ifdef QEMU_MSG_ZEROCOPY static int qio_channel_socket_flush(QIOChannel *ioc, Error **errp) Please remove this unrelated whitespace change. @@ -109,6 +117,37 @@ int qio_channel_readv_all_eof(QIOChannel *ioc, return qio_channel_readv_full_all_eof(ioc, iov, niov, NULL, NULL, errp); } +int qio_channel_readv_peek_all_eof(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + Error **errp) +{ + ssize_t len = 0; + ssize_t total = iov_size(iov, niov); + + while (len < total) { + len = qio_channel_readv_full(ioc, iov, niov, NULL, +NULL, QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp); + + if (len == QIO_CHANNEL_ERR_BLOCK) { +if (qemu_in_coroutine()) { +qio_channel_yield(ioc, G_IO_IN); +} else { +qio_channel_wait(ioc, G_IO_IN); +} +continue; + } + if (len == 0) { + return 0; + } + if (len < 0) { + return -1; + } + } This will busy wait burning CPU where there is a read > 0 and < total. Daniel, i could use MSG_WAITALL too if that works but then we will lose opportunity to yield. Or if you have some other idea. How easy would this happen? Another alternative is we could just return the partial len to caller then we fallback to the original channel orders if it happens. And then if it mostly will never happen it'll behave merely the same as what we want. Well we're trying to deal with a bug where the slow and/or unreliable network causes channels to arrive in unexpected order. Given we know we're having network trouble, I wouldn't want to make more assumptions about things happening correctly. With regards, Daniel Peter, I have seen MSG_PEEK used in combination with MSG_WAITALL, but looks like even though chances are less it can still return partial data even with multiple retries for signal case, so is not full proof. *MSG_WAITALL *(since Linux 2.2) This flag requests that the operation block until the full request is satisfied. However, the call may still return less data than requested if a signal is caught, an error or disconnect occurs, or the next data to be received is of a different type than that returned. This flag has no effect for datagram sockets. Actual read ahead will be little hackish, so just confirming we all are in agreement to do actual read ahead and i can send patch? :) Thanks Manish Mishra
Re: [PATCH for-8.0 v3 24/45] tcg: Use output_pref wrapper function
On 11/11/22 08:40, Richard Henderson wrote: We will shortly have the possibility of more that two outputs, though only for calls (for which preferences are moot). Avoid direct references to op->output_pref[] when possible. Signed-off-by: Richard Henderson --- include/tcg/tcg.h | 5 + tcg/tcg.c | 34 ++ 2 files changed, 23 insertions(+), 16 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH for-8.0 v3 22/45] tci: MAX_OPC_PARAM_IARGS is no longer used
On 11/11/22 08:40, Richard Henderson wrote: Signed-off-by: Richard Henderson --- tcg/tci.c| 1 - tcg/tci/tcg-target.c.inc | 4 2 files changed, 5 deletions(-) Since commit 7b7d8b2d9a ("tcg/tci: Use ffi for calls")? Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH for-8.0 v3 21/45] accel/tcg/plugin: Use copy_op in append_{udata, mem}_cb
On 11/11/22 08:40, Richard Henderson wrote: Better to re-use the existing function for copying ops. Signed-off-by: Richard Henderson --- accel/tcg/plugin-gen.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH for-8.0 v3 20/45] accel/tcg/plugin: Avoid duplicate copy in copy_call
On 11/11/22 08:40, Richard Henderson wrote: We copied all of the arguments in copy_op_nocheck. We only need to replace the one argument that we change. Signed-off-by: Richard Henderson --- accel/tcg/plugin-gen.c | 2 -- 1 file changed, 2 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH-for-7.2] vhost-vdpa: skip TPM CRB memory section
On 22/11/22 15:53, marcandre.lur...@redhat.com wrote: From: Marc-André Lureau 851d6d1a0f ("vfio/common: remove spurious tpm-crb-cmd misalignment warning") removed the warning on vfio_listener_region_add() path. An error is reported for vhost-vdpa case: qemu-kvm: vhost_vdpa_listener_region_add received unaligned region Skip the CRB device. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2141965 Signed-off-by: Marc-André Lureau --- hw/virtio/vhost-vdpa.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c index 7468e44b87..9d7206e4b8 100644 --- a/hw/virtio/vhost-vdpa.c +++ b/hw/virtio/vhost-vdpa.c @@ -19,6 +19,7 @@ #include "hw/virtio/virtio-net.h" #include "hw/virtio/vhost-shadow-virtqueue.h" #include "hw/virtio/vhost-vdpa.h" +#include "sysemu/tpm.h" #include "exec/address-spaces.h" #include "migration/blocker.h" #include "qemu/cutils.h" @@ -46,6 +47,11 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section, { Int128 llend; +if (TPM_IS_CRB(section->mr->owner)) { +/* The CRB command buffer has its base address unaligned. */ +return true; +} Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH] cleanup: Tweak and re-run return_directly.cocci
On Tue, 22 Nov 2022 at 15:04, Philippe Mathieu-Daudé wrote: > > On 21/11/22 17:42, Max Filippov wrote: > > On Mon, Nov 21, 2022 at 6:01 AM Markus Armbruster wrote: > >> .../xtensa/core-dsp3400/xtensa-modules.c.inc | 136 +- > >> target/xtensa/core-lx106/xtensa-modules.c.inc | 16 +-- > > > > These files are generated and were imported from xtensa configuration > > overlays, they're not supposed to be changed. > > Tools can get the repository file list using 'git ls-files', which > itself support file pattern exclusion [*]. > > We can create i.e. 'scripts/imported-files.txt' with: > >linux-headers/ >target/hexagon/imported/ >target/xtensa/core* >tests/tcg/mips/user/ Good idea. scripts/clean-header-guards.pl also has these to add to the exclude list: include/standard-headers/ pc-bios/ tests/tcg/ tests/multiboot/ thanks -- PMM
Re: [PATCH v2 1/2] cleanup: Tweak and re-run return_directly.cocci
On 22/11/22 14:49, Markus Armbruster wrote: Tweak the semantic patch to drop redundant parenthesis around the return expression. Coccinelle drops a comment in hw/rdma/vmw/pvrdma_cmd.c; restored manually. Coccinelle messes up vmdk_co_create(), not sure why. Change dropped, will be done manually in the next commit. Line breaks in target/avr/cpu.h and hw/rdma/vmw/pvrdma_cmd.c tidied up manually. Whitespace in tools/virtiofsd/fuse_lowlevel.c tidied up manually. checkpatch.pl complains "return of an errno should typically be -ve" two times for hw/9pfs/9p-synth.c. Preexisting, the patch merely makes it visible to checkpatch.pl. Signed-off-by: Markus Armbruster --- scripts/coccinelle/return_directly.cocci | 5 +-- include/hw/pci/pci.h | 7 +-- target/avr/cpu.h | 4 +- hw/9pfs/9p-synth.c | 14 ++ hw/char/sifive_uart.c| 4 +- hw/ppc/ppc4xx_sdram.c| 5 +-- hw/rdma/vmw/pvrdma_cmd.c | 57 +--- hw/virtio/vhost-user.c | 6 +-- migration/dirtyrate.c| 10 + migration/tls.c | 6 +-- replay/replay-time.c | 5 +-- semihosting/console.c| 4 +- softmmu/memory.c | 11 + softmmu/physmem.c| 9 +--- target/loongarch/cpu.c | 4 +- target/mips/tcg/dsp_helper.c | 15 ++- target/riscv/debug.c | 6 +-- target/riscv/vector_helper.c | 28 +++- tests/bench/benchmark-crypto-akcipher.c | 6 +-- tests/qtest/erst-test.c | 5 +-- tests/qtest/hexloader-test.c | 6 +-- tests/qtest/pvpanic-pci-test.c | 6 +-- tests/qtest/pvpanic-test.c | 6 +-- tests/qtest/test-filter-mirror.c | 6 +-- tests/qtest/virtio-ccw-test.c| 6 +-- tests/tcg/multiarch/sha512.c | 9 +--- tools/virtiofsd/fuse_lowlevel.c | 24 +++--- 27 files changed, 70 insertions(+), 204 deletions(-) diff --git a/scripts/coccinelle/return_directly.cocci b/scripts/coccinelle/return_directly.cocci index 4cf50e75ea..6cb1b3c99a 100644 --- a/scripts/coccinelle/return_directly.cocci +++ b/scripts/coccinelle/return_directly.cocci @@ -11,9 +11,8 @@ identifier F; -T VAR; ... when != VAR --VAR = -+return - E; +-VAR = (E); -return VAR; ++return E; ... when != VAR } Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH v2 2/2] block/vmdk: Simplify vmdk_co_create() to return directly
On 22/11/22 14:49, Markus Armbruster wrote: Cc: Fam Zheng Cc: Kevin Wolf Cc: Hanna Reitz Cc: qemu-bl...@nongnu.org Signed-off-by: Markus Armbruster --- block/vmdk.c | 28 +++- 1 file changed, 11 insertions(+), 17 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH] cleanup: Tweak and re-run return_directly.cocci
On 21/11/22 17:42, Max Filippov wrote: On Mon, Nov 21, 2022 at 6:01 AM Markus Armbruster wrote: .../xtensa/core-dsp3400/xtensa-modules.c.inc | 136 +- target/xtensa/core-lx106/xtensa-modules.c.inc | 16 +-- These files are generated and were imported from xtensa configuration overlays, they're not supposed to be changed. Tools can get the repository file list using 'git ls-files', which itself support file pattern exclusion [*]. We can create i.e. 'scripts/imported-files.txt' with: linux-headers/ target/hexagon/imported/ target/xtensa/core* tests/tcg/mips/user/ Then use 'git ls-files --exclude-from=scripts/imported-files.txt' ... [*] https://git-scm.com/docs/git-ls-files#Documentation/git-ls-files.txt---exclude-fromltfilegt
[PATCH] vhost-vdpa: skip TPM CRB memory section
From: Marc-André Lureau 851d6d1a0f ("vfio/common: remove spurious tpm-crb-cmd misalignment warning") removed the warning on vfio_listener_region_add() path. An error is reported for vhost-vdpa case: qemu-kvm: vhost_vdpa_listener_region_add received unaligned region Skip the CRB device. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2141965 Signed-off-by: Marc-André Lureau --- hw/virtio/vhost-vdpa.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c index 7468e44b87..9d7206e4b8 100644 --- a/hw/virtio/vhost-vdpa.c +++ b/hw/virtio/vhost-vdpa.c @@ -19,6 +19,7 @@ #include "hw/virtio/virtio-net.h" #include "hw/virtio/vhost-shadow-virtqueue.h" #include "hw/virtio/vhost-vdpa.h" +#include "sysemu/tpm.h" #include "exec/address-spaces.h" #include "migration/blocker.h" #include "qemu/cutils.h" @@ -46,6 +47,11 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section, { Int128 llend; +if (TPM_IS_CRB(section->mr->owner)) { +/* The CRB command buffer has its base address unaligned. */ +return true; +} + if ((!memory_region_is_ram(section->mr) && !memory_region_is_iommu(section->mr)) || memory_region_is_protected(section->mr) || -- 2.38.1
Re: [PATCH] cleanup: Tweak and re-run return_directly.cocci
On 22/11/22 09:58, Markus Armbruster wrote: Thomas Huth writes: On 21/11/2022 17.32, Markus Armbruster wrote: Philippe Mathieu-Daudé writes: On 21/11/22 15:36, Peter Maydell wrote: On Mon, 21 Nov 2022 at 14:03, Markus Armbruster wrote: Tweak the semantic patch to drop redundant parenthesis around the return expression. Coccinelle drops comments in hw/rdma/vmw/pvrdma_cmd.c; restored manually. Coccinelle messes up vmdk_co_create(), not sure why. Transformed manually. Line breaks in target/avr/cpu.h and hw/rdma/vmw/pvrdma_cmd.c tidied up manually. Whitespace in fuse_reply_iov() tidied up manually. checkpatch.pl complains "return of an errno should typically be -ve" two times for hw/9pfs/9p-synth.c. Preexisting, the patch merely makes it visible to checkpatch.pl. checkpatch.pl complains "return is not a function, parentheses are not required" three times for target/mips/tcg/dsp_helper.c. False positives. Signed-off-by: Markus Armbruster .../user/ase/msa/bit-count/test_msa_nloc_b.c | 9 +- .../user/ase/msa/bit-count/test_msa_nloc_d.c | 9 +- [snip long list of other mips test files] 328 files changed, 989 insertions(+), 2099 deletions(-) This patch seems to almost entirely be huge because of these mips test case files. Are they specific to QEMU or are they effectively a 3rd-party import that it doesn't make sense to make local changes to? They are imported and will unlikely be modified. Not obvious to me from git-log. Should I drop the changes to tests/tcg/mips/? I'd say yes. At least move them to a separate patch. Possible status of tests/tcg/mips/: 1. Imported, should not be modified Drop from the patch. 2. Not imported, should be modified 2a. To be reviewed separately from the remainder of the patch Split off. 2b. Likewise, but nobody will care to review, realistically Split off and merge anyway, or drop. I'd go for the latter. 2c. To be reviewed together with the remainder of the patch Keep as is. Which one is it? "1. Imported, should not be modified" please :)
Re: [PATCH v3 1/2] io: Add support for MSG_PEEK for socket channel
On Tue, Nov 22, 2022 at 09:41:01AM -0500, Peter Xu wrote: > On Tue, Nov 22, 2022 at 02:38:53PM +0530, manish.mishra wrote: > > > > On 22/11/22 2:30 pm, Daniel P. Berrangé wrote: > > > On Sat, Nov 19, 2022 at 09:36:14AM +, manish.mishra wrote: > > > > MSG_PEEK reads from the peek of channel, The data is treated as > > > > unread and the next read shall still return this data. This > > > > support is currently added only for socket class. Extra parameter > > > > 'flags' is added to io_readv calls to pass extra read flags like > > > > MSG_PEEK. > > > > > > > > Suggested-by: Daniel P. Berrangé > > > Signed-off-by: manish.mishra > > > > --- > > > > chardev/char-socket.c | 4 +- > > > > include/io/channel.h| 83 + > > > > io/channel-buffer.c | 1 + > > > > io/channel-command.c| 1 + > > > > io/channel-file.c | 1 + > > > > io/channel-null.c | 1 + > > > > io/channel-socket.c | 16 +- > > > > io/channel-tls.c| 1 + > > > > io/channel-websock.c| 1 + > > > > io/channel.c| 73 +++-- > > > > migration/channel-block.c | 1 + > > > > scsi/qemu-pr-helper.c | 2 +- > > > > tests/qtest/tpm-emu.c | 2 +- > > > > tests/unit/test-io-channel-socket.c | 1 + > > > > util/vhost-user-server.c| 2 +- > > > > 15 files changed, 179 insertions(+), 11 deletions(-) > > > > > > > > > > diff --git a/io/channel-socket.c b/io/channel-socket.c > > > > index b76dca9cc1..a06b24766d 100644 > > > > --- a/io/channel-socket.c > > > > +++ b/io/channel-socket.c > > > > @@ -406,6 +406,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc, > > > > } > > > > #endif /* WIN32 */ > > > > +qio_channel_set_feature(QIO_CHANNEL(cioc), > > > > QIO_CHANNEL_FEATURE_READ_MSG_PEEK); > > > > + > > > This covers the incoming server side socket. > > > > > > This also needs to be set in outgoing client side socket in > > > qio_channel_socket_connect_async > > > > > > Yes sorry, i considered only current use-case, but as it is generic one > > both should be there. Thanks will update it. > > > > > > > > > @@ -705,7 +718,6 @@ static ssize_t qio_channel_socket_writev(QIOChannel > > > > *ioc, > > > > } > > > > #endif /* WIN32 */ > > > > - > > > > #ifdef QEMU_MSG_ZEROCOPY > > > > static int qio_channel_socket_flush(QIOChannel *ioc, > > > > Error **errp) > > > Please remove this unrelated whitespace change. > > > > > > > > > > @@ -109,6 +117,37 @@ int qio_channel_readv_all_eof(QIOChannel *ioc, > > > > return qio_channel_readv_full_all_eof(ioc, iov, niov, NULL, NULL, > > > > errp); > > > > } > > > > +int qio_channel_readv_peek_all_eof(QIOChannel *ioc, > > > > + const struct iovec *iov, > > > > + size_t niov, > > > > + Error **errp) > > > > +{ > > > > + ssize_t len = 0; > > > > + ssize_t total = iov_size(iov, niov); > > > > + > > > > + while (len < total) { > > > > + len = qio_channel_readv_full(ioc, iov, niov, NULL, > > > > +NULL, > > > > QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp); > > > > + > > > > + if (len == QIO_CHANNEL_ERR_BLOCK) { > > > > +if (qemu_in_coroutine()) { > > > > +qio_channel_yield(ioc, G_IO_IN); > > > > +} else { > > > > +qio_channel_wait(ioc, G_IO_IN); > > > > +} > > > > +continue; > > > > + } > > > > + if (len == 0) { > > > > + return 0; > > > > + } > > > > + if (len < 0) { > > > > + return -1; > > > > + } > > > > + } > > > This will busy wait burning CPU where there is a read > 0 and < total. > > > > > > > Daniel, i could use MSG_WAITALL too if that works but then we will lose > > opportunity to yield. Or if you have some other idea. > > How easy would this happen? > > Another alternative is we could just return the partial len to caller then > we fallback to the original channel orders if it happens. And then if it > mostly will never happen it'll behave merely the same as what we want. Well we're trying to deal with a bug where the slow and/or unreliable network causes channels to arrive in unexpected order. Given we know we're having network trouble, I wouldn't want to make more assumptions about things happening correctly. With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH] target/arm: added cortex-a55 CPU support for qemu-virt
On Mon, 21 Nov 2022 at 15:08, Timofey Kutergin wrote: > > cortex-a55 is one of newer armv8.2+ CPUs supporting native > Privileged Access Never (PAN) feature. Using this CPU > provides access to this feature without using fictitious "max" > CPU. > > Signed-off-by: Timofey Kutergin Thanks; I've applied this to my target-arm-for-8.0 branch, which will eventually become target-arm when 7.2 is released in a couple of weeks. -- PMM
Re: [PATCH v3 1/2] io: Add support for MSG_PEEK for socket channel
On Tue, Nov 22, 2022 at 02:38:53PM +0530, manish.mishra wrote: > > On 22/11/22 2:30 pm, Daniel P. Berrangé wrote: > > On Sat, Nov 19, 2022 at 09:36:14AM +, manish.mishra wrote: > > > MSG_PEEK reads from the peek of channel, The data is treated as > > > unread and the next read shall still return this data. This > > > support is currently added only for socket class. Extra parameter > > > 'flags' is added to io_readv calls to pass extra read flags like > > > MSG_PEEK. > > > > > > Suggested-by: Daniel P. Berrangé > > Signed-off-by: manish.mishra > > > --- > > > chardev/char-socket.c | 4 +- > > > include/io/channel.h| 83 + > > > io/channel-buffer.c | 1 + > > > io/channel-command.c| 1 + > > > io/channel-file.c | 1 + > > > io/channel-null.c | 1 + > > > io/channel-socket.c | 16 +- > > > io/channel-tls.c| 1 + > > > io/channel-websock.c| 1 + > > > io/channel.c| 73 +++-- > > > migration/channel-block.c | 1 + > > > scsi/qemu-pr-helper.c | 2 +- > > > tests/qtest/tpm-emu.c | 2 +- > > > tests/unit/test-io-channel-socket.c | 1 + > > > util/vhost-user-server.c| 2 +- > > > 15 files changed, 179 insertions(+), 11 deletions(-) > > > > > > > diff --git a/io/channel-socket.c b/io/channel-socket.c > > > index b76dca9cc1..a06b24766d 100644 > > > --- a/io/channel-socket.c > > > +++ b/io/channel-socket.c > > > @@ -406,6 +406,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc, > > > } > > > #endif /* WIN32 */ > > > +qio_channel_set_feature(QIO_CHANNEL(cioc), > > > QIO_CHANNEL_FEATURE_READ_MSG_PEEK); > > > + > > This covers the incoming server side socket. > > > > This also needs to be set in outgoing client side socket in > > qio_channel_socket_connect_async > > > Yes sorry, i considered only current use-case, but as it is generic one both > should be there. Thanks will update it. > > > > > > @@ -705,7 +718,6 @@ static ssize_t qio_channel_socket_writev(QIOChannel > > > *ioc, > > > } > > > #endif /* WIN32 */ > > > - > > > #ifdef QEMU_MSG_ZEROCOPY > > > static int qio_channel_socket_flush(QIOChannel *ioc, > > > Error **errp) > > Please remove this unrelated whitespace change. > > > > > > > @@ -109,6 +117,37 @@ int qio_channel_readv_all_eof(QIOChannel *ioc, > > > return qio_channel_readv_full_all_eof(ioc, iov, niov, NULL, NULL, > > > errp); > > > } > > > +int qio_channel_readv_peek_all_eof(QIOChannel *ioc, > > > + const struct iovec *iov, > > > + size_t niov, > > > + Error **errp) > > > +{ > > > + ssize_t len = 0; > > > + ssize_t total = iov_size(iov, niov); > > > + > > > + while (len < total) { > > > + len = qio_channel_readv_full(ioc, iov, niov, NULL, > > > +NULL, > > > QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp); > > > + > > > + if (len == QIO_CHANNEL_ERR_BLOCK) { > > > +if (qemu_in_coroutine()) { > > > +qio_channel_yield(ioc, G_IO_IN); > > > +} else { > > > +qio_channel_wait(ioc, G_IO_IN); > > > +} > > > +continue; > > > + } > > > + if (len == 0) { > > > + return 0; > > > + } > > > + if (len < 0) { > > > + return -1; > > > + } > > > + } > > This will busy wait burning CPU where there is a read > 0 and < total. > > > > Daniel, i could use MSG_WAITALL too if that works but then we will lose > opportunity to yield. Or if you have some other idea. How easy would this happen? Another alternative is we could just return the partial len to caller then we fallback to the original channel orders if it happens. And then if it mostly will never happen it'll behave merely the same as what we want. The thing is the other approach will be hacky in another way (have a flag migration_consumed_4_bytes_header to either main and multifd channels), then if it'll solve 99.99% cases I'd think it's good enough. Anyway we're working on a corner case already on unreliable network, and even if failure triggered it's not so bad - we just redo the migration. -- Peter Xu
Re: [PATCH for-8.0 07/29] accel/tcg: Honor atomicity of loads
On Fri, 18 Nov 2022 at 09:50, Richard Henderson wrote: > > Create ldst_atomicity.c.inc. > > Not required for user-only code loads, because we've ensured that > the page is read-only before beginning to translate code. > > Signed-off-by: Richard Henderson > +/** > + * required_atomicity: > + * > + * Return the lg2 bytes of atomicity required by @memop for @p. > + * If the operation must be split into two operations to be > + * examined separately for atomicity, return -lg2. > + */ > +static int required_atomicity(CPUArchState *env, uintptr_t p, MemOp memop) > +{ > +int atmax = memop & MO_ATMAX_MASK; > +int size = memop & MO_SIZE; > +unsigned tmp; > + > +if (atmax == MO_ATMAX_SIZE) { > +atmax = size; > +} else { > +atmax >>= MO_ATMAX_SHIFT; > +} > + > +switch (memop & MO_ATOM_MASK) { > +case MO_ATOM_IFALIGN: > +tmp = (1 << atmax) - 1; > +if (p & tmp) { > +return MO_8; > +} > +break; > +case MO_ATOM_NONE: > +return MO_8; > +case MO_ATOM_SUBALIGN: > +tmp = p & -p; > +if (tmp != 0 && tmp < atmax) { > +atmax = tmp; > +} > +break; > +case MO_ATOM_WITHIN16: > +tmp = p & 15; > +if (tmp + (1 << size) <= 16) { > +atmax = size; > +} else if (atmax < size && tmp + (1 << atmax) != 16) { > +/* > + * Paired load/store, where the pairs aren't aligned. > + * One of the two must still be handled atomically. > + */ > +atmax = -atmax; > +} > +break; > +default: > +g_assert_not_reached(); > +} > + > +/* > + * Here we have the architectural atomicity of the operation. > + * However, when executing in a serial context, we need no extra > + * host atomicity in order to avoid racing. This reduction > + * avoids looping with cpu_loop_exit_atomic. > + */ > +if (cpu_in_serial_context(env_cpu(env))) { Is it OK to use cpu_in_serial_context() here ? Even if there's no other vCPU executing in parallel, there might be device model code doing a memory write in the iothread, I think. > +return MO_8; > +} > +return atmax; > +} thanks -- PMM