On Wed, Apr 29, 2015 at 07:51:21PM +1000, Alexey Kardashevskiy wrote:
> On 04/29/2015 02:18 PM, David Gibson wrote:
> >On Sat, Apr 25, 2015 at 10:14:42PM +1000, Alexey Kardashevskiy wrote:
> >>At the moment writing new TCE value to the IOMMU table fails with EBUSY
> >>if there is a valid entry already. However PAPR specification allows
> >>the guest to write new TCE value without clearing it first.
> >>
> >>Another problem this patch is addressing is the use of pool locks for
> >>external IOMMU users such as VFIO. The pool locks are to protect
> >>DMA page allocator rather than entries and since the host kernel does
> >>not control what pages are in use, there is no point in pool locks and
> >>exchange()+put_page(oldtce) is sufficient to avoid possible races.
> >>
> >>This adds an exchange() callback to iommu_table_ops which does the same
> >>thing as set() plus it returns replaced TCE and DMA direction so
> >>the caller can release the pages afterwards. The exchange() receives
> >>a physical address unlike set() which receives linear mapping address;
> >>and returns a physical address as the clear() does.
> >>
> >>This implements exchange() for P5IOC2/IODA/IODA2. This adds a requirement
> >>for a platform to have exchange() implemented in order to support VFIO.
> >>
> >>This replaces iommu_tce_build() and iommu_clear_tce() with
> >>a single iommu_tce_xchg().
> >>
> >>This makes sure that TCE permission bits are not set in TCE passed to
> >>IOMMU API as those are to be calculated by platform code from DMA direction.
> >>
> >>This moves SetPageDirty() to the IOMMU code to make it work for both
> >>VFIO ioctl interface in in-kernel TCE acceleration (when it becomes
> >>available later).
> >>
> >>Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
> >>[aw: for the vfio related changes]
> >>Acked-by: Alex Williamson <alex.william...@redhat.com>
> >
> >This looks mostly good, but there are couple of details that need fixing.
> >
> 
> 
> [...]
> 
> >>diff --git a/arch/powerpc/platforms/powernv/pci.c 
> >>b/arch/powerpc/platforms/powernv/pci.c
> >>index ba75aa5..e8802ac 100644
> >>--- a/arch/powerpc/platforms/powernv/pci.c
> >>+++ b/arch/powerpc/platforms/powernv/pci.c
> >>@@ -598,6 +598,23 @@ int pnv_tce_build(struct iommu_table *tbl, long index, 
> >>long npages,
> >>    return 0;
> >>  }
> >>
> >>+#ifdef CONFIG_IOMMU_API
> >>+int pnv_tce_xchg(struct iommu_table *tbl, long index,
> >>+           unsigned long *tce, enum dma_data_direction *direction)
> >>+{
> >>+   u64 proto_tce = iommu_direction_to_tce_perm(*direction);
> >>+   unsigned long newtce = *tce | proto_tce;
> >>+   unsigned long idx = index - tbl->it_offset;
> >
> >Should this have a BUG_ON or WARN_ON if the supplied tce has bits set
> >below the page mask?
> 
> 
> Why? The caller checks these bits, do we really need to duplicate it
> here?

Because this is the crunch point where bad bits will actually cause
strange stuff to happen.

As much as anything the point of a BUG_ON would be to document that
this function expects the parameter to be aligned.

> 
> 
> >>+   *tce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce));
> >>+   *tce = be64_to_cpu(*tce);
> >>+   *direction = iommu_tce_direction(*tce);
> >>+   *tce &= ~(TCE_PCI_READ | TCE_PCI_WRITE);
> >>+
> >>+   return 0;
> >>+}
> >>+#endif
> 
> 
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: pgpc9KLDF6COu.pgp
Description: PGP signature

Reply via email to