On Fri, 7 Feb 2025 at 16:54, Peter Xu <pet...@redhat.com> wrote: > > On Thu, Feb 06, 2025 at 03:21:51PM +0100, Eric Auger wrote: > > This is a follow-up of Peter's attempt to fix the fact that > > vIOMMUs are likely to be reset before the device they protect: > > > > [PATCH 0/4] intel_iommu: Reset vIOMMU after all the rest of devices > > https://lore.kernel.org/all/20240117091559.144730-1-pet...@redhat.com/ > > > > This is especially observed with virtio devices when a qmp system_reset > > command is sent but also with VFIO devices. > > > > This series puts the vIOMMU reset in the 3-phase exit callback. > > > > This scheme was tested successful with virtio-devices and some > > VFIO devices. Nevertheless not all the topologies have been > > tested yet. > > Eric, > > It's great to know that we seem to be able to fix everything in such small > changeset! > > I would like to double check two things with you here: > > - For VFIO's reset hook, looks like we have landed more changes so that > vfio's reset function is now a TYPE_LEGACY_RESET, and it always do the > reset during "hold" phase only (via legacy_reset_hold()). That part > will make sure vIOMMU (if switching to exit()-only reset) will order > properly with VFIO. Is my understanding correct here?
Yes, we now do a reset of the whole system as a three-phase setup, and the old pre-three-phase reset APIs like qemu_register_reset() and device_class_set_legacy_reset() all happen during the "hold" phase. > - Is it possible if some PCIe devices that will provide its own > phase.exit(), would it matter on the order of PCIe device's > phase.exit() and vIOMMU's phase.exit() (if vIOMMUs switch to use > exit()-only approach like this one)? It's certainly possible for a PCIe device to implement a three-phase reset which does things in the exit phase. However I think I would say that such a device which didn't cancel all outstanding DMA operations during either 'enter' or 'hold' phases would be broken. If it did some other things during the 'exit' phase I don't think the ordering of those vs the iommu 'exit' handling should matter. (To some extent the splitting into three phases is trying to set up a consistent model as outlined in docs/devel/reset.rst and to some extent it's just a convenient way to get a basic "this reset thing I need to do must happen after some other device has done its reset things" which you can achieve by ad-hoc putting them in different phases. Ideally we get mostly the former and a little pragmatic dose of the latter, but the consistent model is not very solidly nailed down so I have a feeling the proportions may not be quite as lopsided as we'd like :-) ) thanks -- PMM