On 1/17/24 10:19 AM, Matthew Rosato wrote: > On 1/17/24 6:01 AM, Cédric Le Goater wrote: >> Adding Alex, >> >> On 1/16/24 23:31, Matthew Rosato wrote: >>> ISM devices are sensitive to manipulation of the IOMMU, so the ISM device >>> needs to be reset before the vfio-pci device is reset (triggering a full >>> UNMAP). In order to ensure this occurs, trigger ISM device resets from >>> subsystem_reset before triggering the PCI bus reset (which will also >>> trigger vfio-pci reset). This only needs to be done for ISM devices >>> which were enabled for use by the guest. >>> Further, ensure that AIF is disabled as part of the reset event. >>> >>> Fixes: ef1535901a ("s390x: do a subsystem reset before the unprotect on >>> reboot") >>> Fixes: 03451953c7 ("s390x/pci: reset ISM passthrough devices on shutdown >>> and system reset") >>> Reported-by: Cédric Le Goater <c...@redhat.com> >>> Signed-off-by: Matthew Rosato <mjros...@linux.ibm.com> >>> --- >>> hw/s390x/s390-pci-bus.c | 26 +++++++++++++++++--------- >>> hw/s390x/s390-virtio-ccw.c | 2 ++ >>> include/hw/s390x/s390-pci-bus.h | 1 + >>> 3 files changed, 20 insertions(+), 9 deletions(-) >>> >>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c >>> index 347580ebac..3e57d5faca 100644 >>> --- a/hw/s390x/s390-pci-bus.c >>> +++ b/hw/s390x/s390-pci-bus.c >>> @@ -151,20 +151,12 @@ static void s390_pci_shutdown_notifier(Notifier *n, >>> void *opaque) >>> pci_device_reset(pbdev->pdev); >>> } >>> -static void s390_pci_reset_cb(void *opaque) >>> -{ >>> - S390PCIBusDevice *pbdev = opaque; >>> - >>> - pci_device_reset(pbdev->pdev); >>> -} >>> - >>> static void s390_pci_perform_unplug(S390PCIBusDevice *pbdev) >>> { >>> HotplugHandler *hotplug_ctrl; >>> if (pbdev->pft == ZPCI_PFT_ISM) { >>> notifier_remove(&pbdev->shutdown_notifier); >>> - qemu_unregister_reset(s390_pci_reset_cb, pbdev); >>> } >>> /* Unplug the PCI device */ >>> @@ -1132,7 +1124,6 @@ static void s390_pcihost_plug(HotplugHandler >>> *hotplug_dev, DeviceState *dev, >>> if (pbdev->pft == ZPCI_PFT_ISM) { >>> pbdev->shutdown_notifier.notify = >>> s390_pci_shutdown_notifier; >>> >>> qemu_register_shutdown_notifier(&pbdev->shutdown_notifier); >>> - qemu_register_reset(s390_pci_reset_cb, pbdev); >>> } >>> } else { >>> pbdev->fh |= FH_SHM_EMUL; >>> @@ -1279,6 +1270,23 @@ static void s390_pci_enumerate_bridge(PCIBus *bus, >>> PCIDevice *pdev, >>> pci_default_write_config(pdev, PCI_SUBORDINATE_BUS, s->bus_no, 1); >>> } >>> +void s390_pci_ism_reset(void) >>> +{ >>> + S390pciState *s = s390_get_phb(); >>> + >>> + S390PCIBusDevice *pbdev, *next; >>> + >>> + /* Trigger reset event for each passthrough ISM device currently >>> in-use */ >>> + QTAILQ_FOREACH_SAFE(pbdev, &s->zpci_devs, link, next) { >>> + if (pbdev->interp && pbdev->pft == ZPCI_PFT_ISM && >>> + pbdev->fh & FH_MASK_ENABLE) { >>> + s390_pci_kvm_aif_disable(pbdev); >>> + >>> + pci_device_reset(pbdev->pdev); >>> + } >>> + } >>> +} >> >> >> Could we instead define a VFIOPCIDevice::resetfn handler for these >> ISM devices (1014:04ed) ? This would be cleaner if possible. >> >> If so, as a prerequisite, we would need to introduce in a little VFIO >> helper to define custom reset handlers. >> >> Thanks, >> >> C. >> > > Oh interesting, I had not noticed that. This may well work -- resetfn is > currently setup via vfio_setup_resetfn_quirk but it would probably be easier > to have a helper that takes the vdev and a function pointer so that we can > provide a platform-specific reset handler (rather than having > hw/vfio/pci-quirks.c worry about CONFIG_S390 etc). I'll have to play around > with this. > >
Hmm, it was a good idea but I don't think this will work. I tried to hack something together today but I'm definitely seeing paths where the vfio_listener_region_del happens before the call to vfio_pci_reset (which would ultimately trigger the new custom resetfn). Perhaps we should stick with the call from subsystem_reset -- it will ensure that the ISM cleanup happens after guest CPUs are stopped but before vfio does its cleanup.