On Tue, Mar 12, 2019 at 01:22:18PM +0100, Paolo Bonzini wrote:
> On 12/03/19 13:14, Peter Xu wrote:
> >> You're right. However, the DMAR FlatView only has three sections so I
> >> suspect it's not a big deal if we keep it per-device. You'd still have
> >> O(n) flatviews when the IOMMU is present and DMAR is enabled, but they
> >> would have a constant number of sections so the cost overall is still
> >> O(n) and not O(n^2). If the IOMMU is present but DMAR is disabled, all
> >> VT-d address spaces would still share the same FlatView vtd_root_nodmar,
> >> and that is where the performance loss happens.
> >>
> >> The final scheme would be same as above with vtd_root_dmar replaced by
> >> vtd_root_dmar_%d.
> >
> > Just to make sure I understand - do you mean to keep the DMAR-enabled
> > scenario as it is (I think it should be at least as slow as the perf
> > number provided by Sergio)
>
> No, the DMAR-enabled scenario would not be (too) slow because the
> vtd_root_dmar FlatView only has three sections. What makes things slow
> is not sharing the vtd_root_nodmar FlatView, which has hundreds of sections.
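
For a rough feel of the scaling argument quoted above, here is a toy
cost model in Python. It is only a sketch: the function name and the
constants base_sections and sections_per_device are invented for
illustration and are not measured from QEMU; only the "three sections"
figure for the DMAR view comes from this thread. It assumes the nodmar
view grows linearly with the number of devices, which is what would
make unshared views O(n^2) overall.

```python
def total_flatview_sections(num_devices, dmar_enabled, share_nodmar,
                            base_sections=100, sections_per_device=2,
                            dmar_sections=3):
    """Toy estimate of how many FlatView sections exist across all
    VT-d address spaces. All defaults except dmar_sections are
    hypothetical placeholders, not QEMU measurements."""
    if dmar_enabled:
        # One small per-device view (vtd_root_dmar_%d): O(n) in total.
        return num_devices * dmar_sections

    # Assumed: the nodmar view gains a few sections per device.
    nodmar_sections = base_sections + sections_per_device * num_devices

    if share_nodmar:
        # A single vtd_root_nodmar view shared by every address space.
        return nodmar_sections

    # One private copy of the large view per device: O(n^2) in total.
    return num_devices * nodmar_sections


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n,
              total_flatview_sections(n, True, False),
              total_flatview_sections(n, False, True),
              total_flatview_sections(n, False, False))
```

Under these assumptions the per-device DMAR views and the shared nodmar
view both grow linearly with the device count, while unshared nodmar
views grow quadratically.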
Ah I see the point!

>
> > but we only fix up when guest doesn't
> > enable DMAR? Asked since IIUC normally people won't specify "-device
> > intel-iommu" if they don't need the IOMMU functionality at all (and
> > DMAR is the major one), and if it's not specified then we should not
> > suffer from the performance degradation after all?
>
> We would also suffer from performance degradation when using iommu=pt,
> which is the common case for VFIO for example.

You are right. And AFAIU iommu=pt should be used no matter what (IIRC
people have discussed making it the default) so indeed it can cover a
lot of cases.

>
> > (But hmm... I think it could benefit users who only need IR/x2apic...)
> >
> > Btw, what do you think about my previous proposal? Do you think it's
> > doable somehow? (Make IOMMUMemoryRegion have only an MR pointer that
> > points to the shared MRs)
>
> I'm not sure, but it would certainly be too complicated for 4.0.

Yeah, if nodmar is the only bottleneck so far then I completely agree
with the approach. I'll draft something tomorrow and see whether it
works.

Thanks,

-- 
Peter Xu