On Mon, Oct 17, 2016 at 10:47:02PM -0600, Alex Williamson wrote:
> On Tue, 18 Oct 2016 15:06:55 +1100
> David Gibson <da...@gibson.dropbear.id.au> wrote:
>
> > On Mon, Oct 17, 2016 at 10:07:36AM -0600, Alex Williamson wrote:
> > > On Mon, 17 Oct 2016 18:44:21 +0300
> > > "Aviv B.D" <bd.a...@gmail.com> wrote:
> > >
> > > > From: "Aviv Ben-David" <bd.a...@gmail.com>
> > > >
> > > > * Advertise Cache Mode capability in iommu cap register.
> > > >   This capability is controlled by the "cache-mode" property of the
> > > >   intel-iommu device.
> > > >   To enable this option call QEMU with
> > > >   "-device intel-iommu,cache-mode=true".
> > > >
> > > > * On page cache invalidation in intel vIOMMU, check if the domain
> > > >   belongs to a registered notifier, and notify accordingly.
> > > >
> > > > Currently this patch still doesn't enable VFIO device support with
> > > > vIOMMU present. Current problems:
> > > > * vfio_iommu_map_notify is not aware of the memory range belonging
> > > >   to a specific VFIOGuestIOMMU.
> > >
> > > Could you elaborate on why this is an issue?
> > >
> > > > * memory_region_iommu_replay hangs QEMU on start up while it iterates
> > > >   over the 64-bit address space. Commenting out the call to this
> > > >   function enables a workable VFIO device while vIOMMU is present.
> > >
> > > This has been discussed previously, it would be incorrect for vfio not
> > > to call the replay function. The solution is to add an iommu driver
> > > callback to efficiently walk the mappings within a MemoryRegion.
> >
> > Right, replay is a bit of a hack. There are a couple of other
> > approaches that might be adequate without a new callback:
> >   - Make the VFIOGuestIOMMU aware of the guest address range mapped
> >     by the vIOMMU. Intel currently advertises that as a full 64-bit
> >     address space, but I bet that's not actually true in practice.
> >   - Have the IOMMU MR advertise a (minimum) page size for vIOMMU
> >     mappings. That may let you step through the range with greater
> >     strides
>
> Hmm, VT-d supports at least a 39-bit address width and always supports
> a minimum 4k page size, so yes that does reduce us from 2^52 steps down
> to 2^27,

Right, which is probably doable, if not ideal

> but it's still absurd to walk through the raw address space.

Well.. it depends on the internal structure of the IOMMU. For Power,
it's traditionally just a 1-level page table, so we can't actually do
any better than stepping through each IOMMU page.

> It does however seem correct to create the MemoryRegion with a width
> that actually matches the IOMMU capability, but I don't think that's a
> sufficient fix by itself. Thanks,

I suspect it would actually make it workable in the short term.

But I don't disagree that a "traverse" or "replay" callback of some
sort in the iommu_ops is a better idea long term. Having a fallback
to the current replay implementation if the callback isn't supplied
seems pretty reasonable though.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
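
To make the step counts and the fallback idea from the exchange above concrete, here is a rough sketch in Python. It is illustrative only and is not QEMU code: `replay_steps`, `replay`, the dict standing in for iommu_ops, and the `translate` lookup are all hypothetical names invented for this example. It assumes a replay that visits the address space one minimum-sized page at a time.

```python
# Illustrative sketch only; none of these names are real QEMU APIs.

PAGE_SHIFT_4K = 12  # 4k minimum page size, as stated for VT-d in the thread


def replay_steps(address_width_bits, page_shift=PAGE_SHIFT_4K):
    """Number of page-sized steps needed to walk the whole address space."""
    return 1 << (address_width_bits - page_shift)


def replay(iommu_ops, address_width_bits, translate, page_shift=PAGE_SHIFT_4K):
    """Use a driver-supplied 'replay' callback when one exists; otherwise
    fall back to probing every page in the address range."""
    callback = iommu_ops.get("replay")  # hypothetical optional callback
    if callback is not None:
        return list(callback())
    page_size = 1 << page_shift
    mappings = []
    for iova in range(0, 1 << address_width_bits, page_size):
        entry = translate(iova)  # hypothetical lookup, None if unmapped
        if entry is not None:
            mappings.append((iova, entry))
    return mappings


full_64bit = replay_steps(64)  # 2^52 steps for a 64-bit region
vtd_39bit = replay_steps(39)   # 2^27 steps for a 39-bit region
```

With a single-level table, as described for Power, the fallback loop is already about as good as a callback could be. With a multi-level table, a callback can skip unpopulated subtrees instead of probing every page.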