On Wed, Jan 17, 2024 at 11:29:08AM +0100, Eric Auger wrote:
> Hi Peter,

Hi, Eric,
Thanks for the reviews!

>
> On 1/17/24 10:15, pet...@redhat.com wrote:
> > From: Peter Xu <pet...@redhat.com>
> >
> > There're issue reported that when syetem_reset the VM with an intel iommu
> system_reset
> > device and MT2892 PF(mlx5_core driver), the host kernel throws DMAR error.
> >
> > https://issues.redhat.com/browse/RHEL-7188
> >
> > Alex quickly spot a possible issue on ordering of device resets.
> >
> > It's verified by our QE team then that it is indeed the root cause of the
> > problem. Consider when vIOMMU is reset before a VFIO device in a system
> > reset: the device can be doing DMAs even if the vIOMMU is gone; in this
> > specific context it means the shadow mapping can already be completely
> > destroyed. Host will see these DMAs as malicious and report.
> That's curious we did not get this earlier?

I sincerely don't know. It could be that we just didn't test system resets
much. Or we could have overlooked the host dmesg output; after all, the
error messages can be benign from a functional point of view.

> >
> > To fix it, we'll need to make sure all devices under the vIOMMU device
> > hierachy will be reset before the vIOMMU itself. There's plenty of trick
> > inside, one can get those by reading the last patch.
> Not sure what you meant here ;-)

I meant that making sure all the vIOMMU-managed devices are reset before
the vIOMMU itself is tricky to implement. I didn't describe any of those
details in the cover letter because most of them are covered in patch 4,
and I wanted to point to that patch for the details. Since I think it's
very tricky, I left a major comment in the code so the explanation persists.

Thanks,

-- 
Peter Xu