Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

2019-04-09 Thread Lu Baolu

Hi James,

On 4/6/19 2:02 AM, James Sewart wrote:

Hey Lu,

My bad, did some debugging on my end. The issue was swapping out
find_domain for iommu_get_domain_for_dev. It seems in some situations the
domain is not attached to the group but the device is expected to have the
domain still stored in its archdata.

I’ve attached the final patch with find_domain left in place, which seems
to work in my testing.



Just looked into your v3 patch set; here are some thoughts from my end,
just for your information.

Let me post the problems we want to address.

1. When allocating a new group for a device, how should we determine the
type of the default domain?

2. If we need to put a device into an existing group which uses a
different type of domain from what the device desires to use, we might
break the functionality of the device.

My new thought is to let the generic iommu code determine the default
domain type (hence my proposed vendor-specific default domain type
patches could be dropped).

If the default domain type is DMA (dynamic) mapping, and later in
iommu_no_mapping() we determine that we must use an identity domain, we
then call iommu_request_dm_for_dev(dev).

If the default domain type is identity mapping, and later in
iommu_no_mapping() we determine that we must use a DMA domain, we then
call iommu_request_dma_domain_for_dev(dev).

We already have iommu_request_dm_for_dev() in iommu.c. We only need to
implement iommu_request_dma_domain_for_dev().
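
A minimal sketch of what the new helper could look like, assuming it simply
mirrors iommu_request_dm_for_dev() and asks the core to replace the group's
default domain with a DMA domain (the request_default_domain_for_dev()
helper used below is an assumption, not existing code):

/*
 * Sketch only: counterpart of iommu_request_dm_for_dev(). The internal
 * request_default_domain_for_dev(dev, type) helper is assumed here; the
 * existing identity-map request path would be refactored to share it.
 */
int iommu_request_dma_domain_for_dev(struct device *dev)
{
	return request_default_domain_for_dev(dev, IOMMU_DOMAIN_DMA);
}

The Intel driver would then call this from the place where
iommu_no_mapping() decides the device cannot keep an identity mapping.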

With this done, your patch titled "Create an IOMMU group for devices
that require an identity map" could also be dropped.

Any thoughts?


Cheers,
James.


Best regards,
Lu Baolu

Re: [PATCH 01/18] drivers core: Add I/O ASID allocator

2019-04-09 Thread Paul E. McKenney
On Tue, Apr 09, 2019 at 06:21:30PM +0300, Andriy Shevchenko wrote:
> On Tue, Apr 09, 2019 at 07:53:08AM -0700, Paul E. McKenney wrote:
> > On Tue, Apr 09, 2019 at 01:30:30PM +0300, Andriy Shevchenko wrote:
> > > On Tue, Apr 09, 2019 at 03:04:36AM -0700, Christoph Hellwig wrote:
> > > > On Tue, Apr 09, 2019 at 01:00:49PM +0300, Andriy Shevchenko wrote:
> > > > > I think it makes sense to add a helper macro to rcupdate.h
> > > > > (and we have several cases in kernel that can utilize it)
> > > > > 
> > > > > #define kfree_non_null_rcu(ptr, rcu_head) \
> > > > >   do {\
> > > > >   if (ptr)\
> > > > >   kfree_rcu(ptr, rcu_head);   \
> > > > >   } while (0)
> > > > > 
> > > > > as a more common pattern for resource deallocators.
> > > > 
> > > > I think that should move straight into kfree_rcu.  
> > > 
> > > Possible. I didn't dare to offer this due to lack of knowledge of how
> > > it's used in other places.
> > > 
> > > > In general
> > > > we expect *free* to deal with NULL pointers transparently, so we
> > > > should do so here as well.
> > > 
> > > Exactly my point, thanks.
> > 
> > As shown below?
> 
> Looks pretty much good to me, thanks!
> Reviewed-by: Andriy Shevchenko 

Applied, thank you!

Thanx, Paul

> > And now that you mention it, it is a bit surprising that no one has
> > complained before.  ;-)
> > 
> > Thanx, Paul
> > 
> > 
> > 
> > commit 23ad938244968e9d2a8001a1c52887c113b182f6
> > Author: Paul E. McKenney 
> > Date:   Tue Apr 9 07:48:18 2019 -0700
> > 
> > rcu: Make kfree_rcu() ignore NULL pointers
> > 
> > This commit makes the kfree_rcu() macro's semantics be consistent
> > with the likes of kfree() by adding a check for NULL pointers, so
> > that kfree_rcu(NULL, ...) is a no-op.
> > 
> > Reported-by: Andriy Shevchenko 
> > Reported-by: Christoph Hellwig 
> > Signed-off-by: Paul E. McKenney 
> > 
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 922bb6848813..c68649b9bcec 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -828,9 +828,13 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
> >   * The BUILD_BUG_ON check must not involve any function calls, hence the
> >   * checks are done in macros here.
> >   */
> > -#define kfree_rcu(ptr, rcu_head)   \
> > -   __kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
> > -
> > +#define kfree_rcu(ptr, rhf)					\
> > +do {								\
> > +	typeof (ptr) ___p = (ptr);				\
> > +								\
> > +	if (___p)						\
> > +		__kfree_rcu(&((___p)->rhf), offsetof(typeof(*(ptr)), rhf)); \
> > +} while (0)
> >  
> >  /*
> >   * Place this after a lock-acquisition primitive to guarantee that
> > 
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 
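
For reference, the call-site simplification this change enables looks like
the sketch below (hypothetical structure and field names):

struct foo {
	struct rcu_head rcu;
	/* ... payload ... */
};

void release_foo(struct foo *p)
{
	/*
	 * Before this commit callers had to open-code the NULL check:
	 *	if (p)
	 *		kfree_rcu(p, rcu);
	 * With kfree_rcu(NULL, ...) now a no-op, matching kfree():
	 */
	kfree_rcu(p, rcu);
}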



Do we always need to reserve an iova before iommu_map?

2019-04-09 Thread Nicolin Chen
Hi all,

According to the routine of iommu_dma_alloc(), it allocates an iova and
then does iommu_map() to map the iova to the physical address of newly
allocated pages. However, in remoteproc_core.c, I see its code call
iommu_map() without a prior alloc_iova() or alloc_iova_fast().

Is it safe to do so? If an iova range is not allocated but mapped,
would a later iommu_dma_alloc() happen to hit this iova range when
doing its alloc_iova() and then fail to map?
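
For reference, the two patterns being contrasted look roughly like the
sketch below (names such as iovad, domain, da/pa/len are illustrative;
error handling trimmed):

#include <linux/dma-mapping.h>
#include <linux/iommu.h>
#include <linux/iova.h>

/* iommu-dma style: reserve an IOVA range first, then map it. */
static dma_addr_t map_with_iova_reservation(struct iova_domain *iovad,
					    struct iommu_domain *domain,
					    phys_addr_t phys, size_t size)
{
	unsigned long pfn = alloc_iova_fast(iovad, size >> PAGE_SHIFT,
					    DMA_BIT_MASK(32) >> PAGE_SHIFT,
					    true);
	if (!pfn)
		return 0;

	iommu_map(domain, (dma_addr_t)pfn << PAGE_SHIFT, phys, size,
		  IOMMU_READ | IOMMU_WRITE);
	return (dma_addr_t)pfn << PAGE_SHIFT;
}

/* remoteproc style: map a firmware-chosen device address directly,
 * without telling any IOVA allocator that the range is now in use. */
static int map_fixed_device_address(struct iommu_domain *domain,
				    unsigned long da, phys_addr_t pa,
				    size_t len)
{
	return iommu_map(domain, da, pa, len, IOMMU_READ | IOMMU_WRITE);
}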

And I am not very familiar with remoteproc code, so if I am missing
something, please kindly educate me.

Thanks
Nicolin


[PATCH v3 1/3] iommu: io-pgtable: Add ARM Mali midgard MMU page table format

2019-04-09 Thread Rob Herring
The ARM Mali midgard GPU's page tables are similar to the standard 64-bit
stage 1 format, but have a few differences. Add a new format type to
represent them. The input address size is 48 bits and the output address
size is 40 bits (and possibly less?). Note that the later bifrost GPUs
follow the standard 64-bit stage 1 format.

The differences in the format compared to 64-bit stage 1 format are:

The 3rd level page entry bits are 0x1 instead of 0x3 for page entries.

The access flags are not read-only and unprivileged, but read and write.
This is similar to stage 2 entries, but the memory attributes field matches
stage 1 being an index.

The nG bit is not set by the vendor driver. This one didn't seem to matter,
but we'll keep it aligned to the vendor driver.

Cc: Will Deacon 
Acked-by: Robin Murphy 
Cc: Joerg Roedel 
Cc: linux-arm-ker...@lists.infradead.org
Cc: iommu@lists.linux-foundation.org
Acked-by: Alyssa Rosenzweig 
Signed-off-by: Rob Herring 
---
This is really v4 of the patch. v3 is the series version.

Joerg, please ack so we can take this via the drm tree.

v3:
- Incorporated refactoring from Robin

 drivers/iommu/io-pgtable-arm.c | 91 ++
 drivers/iommu/io-pgtable.c |  1 +
 include/linux/io-pgtable.h |  7 +++
 3 files changed, 77 insertions(+), 22 deletions(-)
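
For context, a driver would select the new format through the existing
io-pgtable API, roughly as in the sketch below (the values and the omitted
TLB callbacks are illustrative, not taken from this patch):

#include <linux/io-pgtable.h>
#include <linux/sizes.h>

/* Illustrative only: instantiating the Mali format for a GPU device.
 * A real user must also provide cfg.tlb flush callbacks (omitted here);
 * dev and cookie are assumed to be the driver's device and private data. */
struct io_pgtable_cfg cfg = {
	.pgsize_bitmap	= SZ_4K | SZ_2M,	/* placeholder page sizes */
	.ias		= 48,			/* input address bits */
	.oas		= 40,			/* output address bits */
	.iommu_dev	= dev,
};
struct io_pgtable_ops *ops;

ops = alloc_io_pgtable_ops(ARM_MALI_LPAE, &cfg, cookie);
if (!ops)
	return -ENOMEM;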

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index d3700ec15cbd..4e21efbc4459 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -172,6 +172,10 @@
 #define ARM_LPAE_MAIR_ATTR_IDX_CACHE   1
 #define ARM_LPAE_MAIR_ATTR_IDX_DEV 2

+#define ARM_MALI_LPAE_TTBR_ADRMODE_TABLE (3u << 0)
+#define ARM_MALI_LPAE_TTBR_READ_INNER  BIT(2)
+#define ARM_MALI_LPAE_TTBR_SHARE_OUTER BIT(4)
+
 /* IOPTE accessors */
 #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))

@@ -180,11 +184,6 @@

 #define iopte_prot(pte)((pte) & ARM_LPAE_PTE_ATTR_MASK)

-#define iopte_leaf(pte,l)  \
-   (l == (ARM_LPAE_MAX_LEVELS - 1) ?   \
-   (iopte_type(pte,l) == ARM_LPAE_PTE_TYPE_PAGE) : \
-   (iopte_type(pte,l) == ARM_LPAE_PTE_TYPE_BLOCK))
-
 struct arm_lpae_io_pgtable {
struct io_pgtable   iop;

@@ -198,6 +197,15 @@ struct arm_lpae_io_pgtable {

 typedef u64 arm_lpae_iopte;

+static inline bool iopte_leaf(arm_lpae_iopte pte, int lvl,
+ enum io_pgtable_fmt fmt)
+{
+   if (lvl == (ARM_LPAE_MAX_LEVELS - 1) && fmt != ARM_MALI_LPAE)
+   return iopte_type(pte, lvl) == ARM_LPAE_PTE_TYPE_PAGE;
+
+   return iopte_type(pte, lvl) == ARM_LPAE_PTE_TYPE_BLOCK;
+}
+
 static arm_lpae_iopte paddr_to_iopte(phys_addr_t paddr,
 struct arm_lpae_io_pgtable *data)
 {
@@ -303,12 +311,14 @@ static void __arm_lpae_init_pte(struct 
arm_lpae_io_pgtable *data,
if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_NS)
pte |= ARM_LPAE_PTE_NS;

-   if (lvl == ARM_LPAE_MAX_LEVELS - 1)
+   if (data->iop.fmt != ARM_MALI_LPAE && lvl == ARM_LPAE_MAX_LEVELS - 1)
pte |= ARM_LPAE_PTE_TYPE_PAGE;
else
pte |= ARM_LPAE_PTE_TYPE_BLOCK;

-   pte |= ARM_LPAE_PTE_AF | ARM_LPAE_PTE_SH_IS;
+   if (data->iop.fmt != ARM_MALI_LPAE)
+   pte |= ARM_LPAE_PTE_AF;
+   pte |= ARM_LPAE_PTE_SH_IS;
pte |= paddr_to_iopte(paddr, data);

__arm_lpae_set_pte(ptep, pte, &data->iop.cfg);
@@ -321,7 +331,7 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable 
*data,
 {
arm_lpae_iopte pte = *ptep;

-   if (iopte_leaf(pte, lvl)) {
+   if (iopte_leaf(pte, lvl, data->iop.fmt)) {
/* We require an unmap first */
WARN_ON(!selftest_running);
return -EEXIST;
@@ -409,7 +419,7 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, 
unsigned long iova,
__arm_lpae_sync_pte(ptep, cfg);
}

-   if (pte && !iopte_leaf(pte, lvl)) {
+   if (pte && !iopte_leaf(pte, lvl, data->iop.fmt)) {
cptep = iopte_deref(pte, data);
} else if (pte) {
/* We require an unmap first */
@@ -429,31 +439,37 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct 
arm_lpae_io_pgtable *data,
if (data->iop.fmt == ARM_64_LPAE_S1 ||
data->iop.fmt == ARM_32_LPAE_S1) {
pte = ARM_LPAE_PTE_nG;
-
if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ))
pte |= ARM_LPAE_PTE_AP_RDONLY;
-
if (!(prot & IOMMU_PRIV))
pte |= ARM_LPAE_PTE_AP_UNPRIV;
-
-   if (prot & IOMMU_MMIO)
-   pte |= (ARM_LPAE_MAIR_ATTR_IDX_DEV
-   << ARM_LPAE_PTE_ATTRINDX_SHIFT);
-   else if (prot & IOMMU_CACHE)
-   pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
-   << 

[PATCH v3 0/3] Initial Panfrost driver

2019-04-09 Thread Rob Herring
Here's v3 of the panfrost driver. Lots of changes from review comments
and further testing. Details are in each patch. Of note, a problem with
MMU page faults has been addressed, improving stability. In the
process, the TLB invalidate has been optimized, which Tomeu says has
improved performance somewhat.

Several dependencies have been applied already, but the first 2 patches
are the remaining dependencies. We need to take the iommu change via
drm-misc or we need a stable branch.

I'm hoping this is the last version. I'm hoping to apply this to drm-misc
this week before -rc5 cutoff.

A git branch is here[1].

Rob

[1] git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git 
panfrost-rebase-v3


Rob Herring (3):
  iommu: io-pgtable: Add ARM Mali midgard MMU page table format
  drm: Add a drm_gem_objects_lookup helper
  drm/panfrost: Add initial panfrost driver

 MAINTAINERS  |   9 +
 drivers/gpu/drm/Kconfig  |   2 +
 drivers/gpu/drm/Makefile |   1 +
 drivers/gpu/drm/drm_gem.c|  93 ++-
 drivers/gpu/drm/panfrost/Kconfig |  14 +
 drivers/gpu/drm/panfrost/Makefile|  12 +
 drivers/gpu/drm/panfrost/TODO|  27 +
 drivers/gpu/drm/panfrost/panfrost_devfreq.c  | 218 
 drivers/gpu/drm/panfrost/panfrost_devfreq.h  |  14 +
 drivers/gpu/drm/panfrost/panfrost_device.c   | 252 +
 drivers/gpu/drm/panfrost/panfrost_device.h   | 124 
 drivers/gpu/drm/panfrost/panfrost_drv.c  | 460 +++
 drivers/gpu/drm/panfrost/panfrost_features.h | 309 ++
 drivers/gpu/drm/panfrost/panfrost_gem.c  |  95 
 drivers/gpu/drm/panfrost/panfrost_gem.h  |  29 +
 drivers/gpu/drm/panfrost/panfrost_gpu.c  | 362 
 drivers/gpu/drm/panfrost/panfrost_gpu.h  |  19 +
 drivers/gpu/drm/panfrost/panfrost_issues.h   | 176 ++
 drivers/gpu/drm/panfrost/panfrost_job.c  | 560 +++
 drivers/gpu/drm/panfrost/panfrost_job.h  |  51 ++
 drivers/gpu/drm/panfrost/panfrost_mmu.c  | 369 
 drivers/gpu/drm/panfrost/panfrost_mmu.h  |  17 +
 drivers/gpu/drm/panfrost/panfrost_regs.h | 298 ++
 drivers/iommu/io-pgtable-arm.c   |  91 ++-
 drivers/iommu/io-pgtable.c   |   1 +
 include/drm/drm_gem.h|   2 +
 include/linux/io-pgtable.h   |   7 +
 include/uapi/drm/panfrost_drm.h  | 142 +
 28 files changed, 3722 insertions(+), 32 deletions(-)
 create mode 100644 drivers/gpu/drm/panfrost/Kconfig
 create mode 100644 drivers/gpu/drm/panfrost/Makefile
 create mode 100644 drivers/gpu/drm/panfrost/TODO
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_devfreq.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_devfreq.h
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_device.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_device.h
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_drv.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_features.h
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_gem.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_gem.h
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_gpu.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_gpu.h
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_issues.h
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_job.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_job.h
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_mmu.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_mmu.h
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_regs.h
 create mode 100644 include/uapi/drm/panfrost_drm.h

--
2.19.1


Re: [PATCH 12/21] dma-iommu: factor atomic pool allocations into helpers

2019-04-09 Thread Robin Murphy

On 27/03/2019 08:04, Christoph Hellwig wrote:

This keeps the code together and will simplify compiling the code
out on architectures that are always dma coherent.


And this is where things take a turn in the direction I just can't get 
on with - I'm looking at the final result and the twisty maze of little 
disjoint helpers all overlapping each other in functionality is really 
difficult to follow. And I would *much* rather have things rely on 
compile-time constant optimisation than spend the future having to fix 
the #ifdefed parts for arm64 whenever x86-centric changes fail to test them.


Conceptually, everything except the iommu_dma_alloc_remap() case is more 
or less just dma_direct_alloc() with an IOMMU mapping on top - if we 
could pass that an internal flag to say "don't fail or bounce because of 
masks" it seems like that approach could end up being quite a bit 
simpler (I did once have lofty plans to refactor the old arm64 code in 
such a way...)


Robin.



Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 51 +--
  1 file changed, 38 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 2013c650718a..8ec69176673d 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -673,6 +673,35 @@ static int iommu_dma_get_sgtable_remap(struct sg_table 
*sgt, void *cpu_addr,
GFP_KERNEL);
  }
  
+static void iommu_dma_free_pool(struct device *dev, size_t size,
+   void *vaddr, dma_addr_t dma_handle)
+{
+   __iommu_dma_unmap(iommu_get_domain_for_dev(dev), dma_handle, size);
+   dma_free_from_pool(vaddr, PAGE_ALIGN(size));
+}
+
+static void *iommu_dma_alloc_pool(struct device *dev, size_t size,
+   dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+{
+   bool coherent = dev_is_dma_coherent(dev);
+   struct page *page;
+   void *vaddr;
+
+   vaddr = dma_alloc_from_pool(PAGE_ALIGN(size), &page, gfp);
+   if (!vaddr)
+   return NULL;
+
+   *dma_handle = __iommu_dma_map(dev, page_to_phys(page), size,
+   dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs),
+   iommu_get_domain_for_dev(dev));
+   if (*dma_handle == DMA_MAPPING_ERROR) {
+   dma_free_from_pool(vaddr, PAGE_ALIGN(size));
+   return NULL;
+   }
+
+   return vaddr;
+}
+
  static void iommu_dma_sync_single_for_cpu(struct device *dev,
dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
  {
@@ -981,21 +1010,18 @@ static void *iommu_dma_alloc(struct device *dev, size_t 
size,
 * get the virtually contiguous buffer we need by way of a
 * physically contiguous allocation.
 */
-   if (coherent) {
-   page = alloc_pages(gfp, get_order(size));
-   addr = page ? page_address(page) : NULL;
-   } else {
-   addr = dma_alloc_from_pool(size, &page, gfp);
-   }
-   if (!addr)
+   if (!coherent)
+   return iommu_dma_alloc_pool(dev, iosize, handle, gfp,
+   attrs);
+
+   page = alloc_pages(gfp, get_order(size));
+   if (!page)
return NULL;
  
+		addr = page_address(page);

*handle = __iommu_dma_map_page(dev, page, 0, iosize, ioprot);
if (*handle == DMA_MAPPING_ERROR) {
-   if (coherent)
-   __free_pages(page, get_order(size));
-   else
-   dma_free_from_pool(addr, size);
+   __free_pages(page, get_order(size));
addr = NULL;
}
} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
@@ -1049,8 +1075,7 @@ static void iommu_dma_free(struct device *dev, size_t 
size, void *cpu_addr,
 * Hence how dodgy the below logic looks...
 */
if (dma_in_atomic_pool(cpu_addr, size)) {
-   __iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
-   dma_free_from_pool(cpu_addr, size);
+   iommu_dma_free_pool(dev, size, cpu_addr, handle);
} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
struct page *page = vmalloc_to_page(cpu_addr);
  




Re: [PATCH 18/18] iommu/vt-d: Add svm/sva invalidate function

2019-04-09 Thread Jacob Pan
On Tue, 9 Apr 2019 17:57:12 +0300
Andriy Shevchenko  wrote:

> On Mon, Apr 08, 2019 at 04:59:33PM -0700, Jacob Pan wrote:
> > When Shared Virtual Address (SVA) is enabled for a guest OS via
> > vIOMMU, we need to provide invalidation support at IOMMU API and
> > driver level. This patch adds Intel VT-d specific function to
> > implement iommu passdown invalidate API for shared virtual address.
> > 
> > The use case is for supporting caching structure invalidation
> > of assigned SVM capable devices. Emulated IOMMU exposes queue
> > invalidation capability and passes down all descriptors from the
> > guest to the physical IOMMU.
> > 
> > The assumption is that guest to host device ID mapping should be
> > resolved prior to calling IOMMU driver. Based on the device handle,
> > host IOMMU driver can replace certain fields before submit to the
> > invalidation queue.  
> 
> > +static int intel_iommu_sva_invalidate(struct iommu_domain *domain,
> > +   struct device *dev, struct
> > iommu_cache_invalidate_info *inv_info) +{
> > +   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > +   struct device_domain_info *info;
> > +   struct intel_iommu *iommu;
> > +   unsigned long flags;
> > +   int cache_type;
> > +   u8 bus, devfn;
> > +   u16 did, sid;
> > +   int ret = 0;
> > +   u64 granu;
> > +   u64 size;
> > +
> > +   if (!inv_info || !dmar_domain ||
> > +   inv_info->version !=
> > IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
> > +   return -EINVAL;
> > +
> > +   iommu = device_to_iommu(dev, &bus, &devfn);
> > +   if (!iommu)
> > +   return -ENODEV;
> > +  
> 
> > +   if (!dev || !dev_is_pci(dev))
> > +   return -ENODEV;  
> 
> How dev is used in above call? Can be dev NULL there optional and
> give non-NULL iommu?
> 
Good catch, dev cannot be NULL. I will move the check before
device_to_iommu().
> > +   switch (1 << cache_type) {  
> 
> BIT() ?
> 
> > +   case IOMMU_CACHE_INV_TYPE_IOTLB:
> > +   if (size && (inv_info->addr_info.addr &
> > ((1 << (VTD_PAGE_SHIFT + size)) - 1))) {  
> 
> BIT() ?
> 
Sounds good for the two BITs.

Thanks
> > +   pr_err("Address out of range,
> > 0x%llx, size order %llu\n",
> > +   inv_info->addr_info.addr,
> > size);
> > +   ret = -ERANGE;
> > +   goto out_unlock;
> > +   }
> > +
> > +   qi_flush_piotlb(iommu, did,
> > mm_to_dma_pfn(inv_info->addr_info.addr),
> > +   inv_info->addr_info.pasid,
> > +   size, granu);
> > +
> > +   /*
> > +* Always flush device IOTLB if ATS is
> > enabled since guest
> > +* vIOMMU exposes CM = 1, no device IOTLB
> > flush will be passed
> > +* down. REVISIT: cannot assume Linux guest
> > +*/
> > +   if (info->ats_enabled) {
> > +   qi_flush_dev_piotlb(iommu, sid,
> > info->pfsid,
> > +
> > inv_info->addr_info.pasid, info->ats_qdep,
> > +
> > inv_info->addr_info.addr, size,
> > +   granu);
> > +   }
> > +   break;
> > +   case IOMMU_CACHE_INV_TYPE_DEV_IOTLB:
> > +   if (info->ats_enabled) {
> > +   qi_flush_dev_piotlb(iommu, sid,
> > info->pfsid,
> > +
> > inv_info->addr_info.pasid, info->ats_qdep,
> > +
> > inv_info->addr_info.addr, size,
> > +   granu);
> > +   } else
> > +   pr_warn("Passdown device IOTLB
> > flush w/o ATS!\n"); +
> > +   break;
> > +   case IOMMU_CACHE_INV_TYPE_PASID:
> > +   qi_flush_pasid_cache(iommu, did, granu,
> > inv_info->pasid); +
> > +   break;
> > +   default:
> > +   dev_err(dev, "Unsupported IOMMU
> > invalidation type %d\n",
> > +   cache_type);
> > +   ret = -EINVAL;
> > +   }  
> 
> > +out_unlock:
> > > +   spin_unlock(&iommu->lock);
> > > +   spin_unlock_irqrestore(&device_domain_lock, flags);
> > +
> > +   return ret;
> > +}  
> 

[Jacob Pan]
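
For reference, the BIT() conversions suggested above would look roughly
like this (a fragment, reusing the names from the quoted hunk):

	switch (BIT(cache_type)) {	/* instead of: 1 << cache_type */
	case IOMMU_CACHE_INV_TYPE_IOTLB:
		/* BIT_ULL() keeps the mask 64-bit safe for large size orders */
		if (size &&
		    (inv_info->addr_info.addr &
		     (BIT_ULL(VTD_PAGE_SHIFT + size) - 1))) {
			ret = -ERANGE;
			goto out_unlock;
		}
		break;
	}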


Re: [PATCH 08/18] iommu: Introduce cache_invalidate API

2019-04-09 Thread Andriy Shevchenko
On Tue, Apr 09, 2019 at 09:43:28AM -0700, Jacob Pan wrote:
> On Tue, 9 Apr 2019 13:07:18 +0300
> Andriy Shevchenko  wrote:
> > On Mon, Apr 08, 2019 at 04:59:23PM -0700, Jacob Pan wrote:

> > > +int iommu_cache_invalidate(struct iommu_domain *domain, struct
> > > device *dev,
> > > +struct iommu_cache_invalidate_info
> > > *inv_info) +{
> > > + int ret = 0;  
> > 
> > Redundant assignment.
> > 
> I am not a security expert, but initialization of local variables can
> be more secure.
> I was looking at this talk.
> https://outflux.net/slides/2018/lss/danger.pdf
> https://cwe.mitre.org/data/definitions/457.html

I hardly see any of these applied to your case here.
Care to show what I'm missing?

> > > + if (unlikely(!domain->ops->cache_invalidate))
> > > + return -ENODEV;
> > > +
> > > + ret = domain->ops->cache_invalidate(domain, dev, inv_info);
> > > +
> > > + return ret;
> > > +}  
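
With the redundant initialisation dropped as suggested, the helper would
reduce to something like this (a sketch of the suggested shape, not the
final patch):

int iommu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
			   struct iommu_cache_invalidate_info *inv_info)
{
	if (unlikely(!domain->ops->cache_invalidate))
		return -ENODEV;

	/* No local variable needed: return the driver callback's result. */
	return domain->ops->cache_invalidate(domain, dev, inv_info);
}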

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 07/21] dma-iommu: move the arm64 wrappers to common code

2019-04-09 Thread Robin Murphy

On 09/04/2019 18:23, Christoph Hellwig wrote:

On Tue, Apr 09, 2019 at 04:07:02PM +0100, Robin Murphy wrote:

-static inline int iommu_dma_init(void)
+static inline void iommu_setup_dma_ops(struct device *dev, u64 dma_base,
+   u64 size, const struct iommu_ops *ops)
   {
-   return 0;
   }


I don't think it makes sense to have a stub for that - AFAICS it should
only ever be called form arch code with an inherent "select IOMMU_DMA"
(much like the stuff which isn't stubbed currently).

Otherwise, I'm about 97% sure the rest of the move looks OK - thanks for
splitting things up.


arm64 only selects IOMMU_DMA if IOMMU_SUPPORT is selected, which can
be disabled.  So to keep some (unusual) arm64 configs compiling we'll need
the stub..


Urgh, right, it worked out before because arm64 stubbed its own caller 
of iommu_dma_init_domain() internally... Oh well - I guess there's no 
nicer alternative, and we have always treated arch_setup_dma_ops() that way.


Robin.


Re: [PATCH 09/21] dma-iommu: refactor iommu_dma_get_sgtable

2019-04-09 Thread Christoph Hellwig
On Tue, Apr 09, 2019 at 04:49:30PM +0100, Robin Murphy wrote:
>> +static int iommu_dma_get_sgtable_remap(struct sg_table *sgt, void *cpu_addr,
>> +size_t size)
>> +{
>> +unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
>> +struct vm_struct *area = find_vm_area(cpu_addr);
>> +
>> +if (WARN_ON(!area || !area->pages))
>> +return -ENXIO;
>> +return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
>> +GFP_KERNEL);
>> +}
>> +
>
> Is this complex enough to deserve being broken out? Really I'd prefer to 
> keep get_sgtable() as small and consolidated as possible so that it's that 
> much easier to delete in future :)

Well, it is logically separate, and this keeps it tidy.  But I agree
with the long term goal of killing off this awkward API that should
never have been added.

>
> I guess there is a certain symmetry with mmap(), so if that's the angle 
> you're dead set on, could we at least keep this guy down where 
> __iommu_dma_get_sgtable_page() was?

It is up there so that we can reduce the number of ifdef blocks for
dma remapping later on.


Re: [PATCH 08/21] dma-iommu: refactor iommu_dma_mmap

2019-04-09 Thread Christoph Hellwig
On Tue, Apr 09, 2019 at 04:29:07PM +0100, Robin Murphy wrote:
> On 27/03/2019 08:04, Christoph Hellwig wrote:
>> Move the vm_area handling into __iommu_dma_mmap, which is renamed
>> to iommu_dma_mmap_remap.
>>
>> Inline __iommu_dma_mmap_pfn into the main function to simplify the code
>> flow a bit.
>>
>> Signed-off-by: Christoph Hellwig 
>> ---
>>   drivers/iommu/dma-iommu.c | 50 ++-
>>   1 file changed, 18 insertions(+), 32 deletions(-)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index d14fe9f8c692..43bd3c7e0f6b 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -597,23 +597,27 @@ static struct page **__iommu_dma_alloc(struct device 
>> *dev, size_t size,
>>   }
>> /**
>> - * __iommu_dma_mmap - Map a buffer into provided user VMA
>> - * @pages: Array representing buffer from __iommu_dma_alloc()
>> + * iommu_dma_mmap_remap - Map a remapped page array into provided user VMA
>> + * @cpu_addr: virtual address of the memory to be remapped
>>* @size: Size of buffer in bytes
>>* @vma: VMA describing requested userspace mapping
>>*
>> - * Maps the pages of the buffer in @pages into @vma. The caller is 
>> responsible
>> + * Maps the pages pointed to by @cpu_addr into @vma. The caller is 
>> responsible
>>* for verifying the correct size and protection of @vma beforehand.
>>*/
>> -static int __iommu_dma_mmap(struct page **pages, size_t size,
>> +static int iommu_dma_mmap_remap(void *cpu_addr, size_t size,
>>  struct vm_area_struct *vma)
>>   {
>> +struct vm_struct *area = find_vm_area(cpu_addr);
>>  unsigned long uaddr = vma->vm_start;
>>  unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
>>  int ret = -ENXIO;
>>   +  if (WARN_ON(!area || !area->pages))
>> +return -ENXIO;
>> +
>>  for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) {
>> -ret = vm_insert_page(vma, uaddr, pages[i]);
>> +ret = vm_insert_page(vma, uaddr, area->pages[i]);
>>  if (ret)
>>  break;
>>  uaddr += PAGE_SIZE;
>> @@ -1052,21 +1056,13 @@ static void iommu_dma_free(struct device *dev, 
>> size_t size, void *cpu_addr,
>>  }
>>   }
>>   -static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
>> -  unsigned long pfn, size_t size)
>> -{
>> -return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
>> -   vma->vm_end - vma->vm_start,
>> -   vma->vm_page_prot);
>> -}
>> -
>>   static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>>  void *cpu_addr, dma_addr_t dma_addr, size_t size,
>>  unsigned long attrs)
>>   {
>>  unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>>  unsigned long off = vma->vm_pgoff;
>> -struct vm_struct *area;
>> +unsigned long pfn;
>>  int ret;
>>  vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, 
>> attrs);
>> @@ -1077,25 +1073,15 @@ static int iommu_dma_mmap(struct device *dev, struct 
>> vm_area_struct *vma,
>>  if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
>>  return -ENXIO;
>>   -  if (!is_vmalloc_addr(cpu_addr)) {
>> -unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
>> -return __iommu_dma_mmap_pfn(vma, pfn, size);
>> -}
>> +if (is_vmalloc_addr(cpu_addr)) {
>> +if (!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
>> +return iommu_dma_mmap_remap(cpu_addr, size, vma);
>> +pfn = vmalloc_to_pfn(cpu_addr);
>> +} else
>> +pfn = page_to_pfn(virt_to_page(cpu_addr));
>
> Nit: braces around the else clause for correct style.

So I generally prefer that style too.  But in this case one of the
later patches will add an ifdef around the if { ... } else, and
adding braces would make that a lot more awkward.  There is another
occurrence or two of this pattern elsewhere in this series as well.


Re: revert dma direct internals abuse

2019-04-09 Thread Thomas Hellstrom via iommu
On Tue, 2019-04-09 at 17:25 +0200, h...@lst.de wrote:
> On Tue, Apr 09, 2019 at 02:17:40PM +, Thomas Hellstrom wrote:
> > If that's the case, I think most of the graphics drivers will stop
> > functioning. I don't think people would want that, and even if the
> > graphics drivers are "to blame" due to not implementing the sync calls,
> > I think the work involved to get things right is impressive, if at all
> > possible.
> 
> Note that this only affects external, untrusted devices.  But that
> may include eGPU,

What about discrete graphics cards, like Radeon and Nvidia? Who gets to
determine what's trusted?

> so yes GPU folks finally need to up their game and
> stop thinking they are above the law^H^H^Hinterface.

And others doing user-space DMA. But I don't think breaking their
drivers is a good way to go about it.

> 
> > There are two things that concern me with dma_alloc_coherent:
> > 
> > 1) It seems to want pages mapped either in the kernel map or vmapped.
> > Graphics drivers allocate huge amounts of memory, typically up to 50%
> > of system memory or more. On a 32-bit PAE system I'm afraid of running
> > out of vmap space as well as not being able to allocate as much memory
> > as I want. Perhaps a dma_alloc_coherent() interface that returns a page
> > rather than a virtual address would do the trick.
> 
> We can't just simply export a page.  For devices that are not cache
> coherent we need to remap the returned memory to be uncached.  In the
> common cases that is either done by setting an uncached bit in the
> page
> tables, or by using a special address space alias.  So the virtual
> address used to access the page matters, and we can't just kmap a random
> page and expect it to be coherent.  If you want memory that is not
> mapped into the kernel direct mapping and DMA to it you need to
> use the proper DMA streaming interface and obey their rules.

GPU libraries traditionally have been taking care of the CPU mapping
caching modes since the first AGP drivers. GPU MMU PTEs commonly
support various caching options, and pages change caching mode
dynamically.
So even if the DMA layer needs to do the remapping, couldn't we do that
on-demand when needed with a simple interface?

> 
> > 2) Exporting using dma-buf. A page allocated using dma_alloc_coherent()
> > for one device might not be coherent for another device. What happens
> > if I allocate a page using dma_alloc_coherent() for device 1 and then
> > want to map it using dma_map_page() for device 2?
> 
> The problem in this case isn't really the coherency - once a page
> is mapped uncached it is 'coherent' for all devices, even those not
> requiring it.  The problem is addressability - the DMA address for
> the same memory might be different for different devices, and something
> that looks contiguous to one device that is using an IOMMU might not
> for another one using the direct mapping.
> 
> We have the dma_get_sgtable API to map a piece of coherent memory
> using the streaming APIs for another device, but it has all sorts of
> problems.
> 
> That being said: your driver already uses the dma coherent API
> under various circumstances, so you already have the above issues.

Yes, but they are hidden behind driver options. We can't have someone
upgrade their kernel and suddenly find things don't work anymore. That
said, I think the SWIOTLB case is rare enough for the below solution to
be acceptable, although the TTM check for the coherent page pool being
available still needs to remain.

Thanks,
Thomas


> 
> In the end swiotlb_nr_tbl() might be the best hint that some bounce
> buffering could happen.  This isn't proper use of the API, but at
> least a little better than your old intel_iommu_enabled check
> and much better than what we have right now.  Something like this:
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> index 6165fe2c4504..ff00bea026c5 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> @@ -545,21 +545,6 @@ static void vmw_get_initial_size(struct
> vmw_private *dev_priv)
>   dev_priv->initial_height = height;
>  }
>  
> -/**
> - * vmw_assume_iommu - Figure out whether coherent dma-remapping
> might be
> - * taking place.
> - * @dev: Pointer to the struct drm_device.
> - *
> - * Return: true if iommu present, false otherwise.
> - */
> -static bool vmw_assume_iommu(struct drm_device *dev)
> -{
> - const struct dma_map_ops *ops = get_dma_ops(dev->dev);
> -
> - return !dma_is_direct(ops) && ops &&
> - ops->map_page != dma_direct_map_page;
> -}
> -
>  /**
>   * vmw_dma_select_mode - Determine how DMA mappings should be set up
> for this
>   * system.
> @@ -581,25 +566,14 @@ static int vmw_dma_select_mode(struct
> vmw_private *dev_priv)
>   [vmw_dma_map_populate] = "Keeping DMA mappings.",
>   [vmw_dma_map_bind] = "Giving up DMA mappings early."};
>  
> - if 
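
(The quoted patch is cut off above.) For reference, the kind of check being
suggested, i.e. using the presence of a SWIOTLB as the bounce-buffering hint
instead of poking at dma_map_ops internals, could look like the sketch below
(the helper name is hypothetical):

#include <linux/swiotlb.h>

/* Hypothetical helper: a non-empty SWIOTLB means DMA addresses may be
 * bounced, which is the hint vmw_dma_select_mode() really needs. */
static bool vmw_dma_may_bounce(void)
{
	return swiotlb_nr_tbl() != 0;
}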

Re: [PATCH 07/21] dma-iommu: move the arm64 wrappers to common code

2019-04-09 Thread Christoph Hellwig
On Tue, Apr 09, 2019 at 04:07:02PM +0100, Robin Murphy wrote:
>> -static inline int iommu_dma_init(void)
>> +static inline void iommu_setup_dma_ops(struct device *dev, u64 dma_base,
>> +u64 size, const struct iommu_ops *ops)
>>   {
>> -return 0;
>>   }
>
> I don't think it makes sense to have a stub for that - AFAICS it should 
> only ever be called form arch code with an inherent "select IOMMU_DMA" 
> (much like the stuff which isn't stubbed currently).
>
> Otherwise, I'm about 97% sure the rest of the move looks OK - thanks for 
> splitting things up.

arm64 only selects IOMMU_DMA if IOMMU_SUPPORT is selected, which can
be disabled.  So to keep some (unusual) arm64 configs compiling we'll need
the stub..


Re: [PATCH 07/21] dma-iommu: move the arm64 wrappers to common code

2019-04-09 Thread Christoph Hellwig
On Tue, Apr 09, 2019 at 04:07:02PM +0100, Robin Murphy wrote:
>> +static void iommu_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
>> +size_t size, enum dma_data_direction dir, unsigned long attrs)
>> +{
>> +if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
>> +iommu_dma_sync_single_for_cpu(dev, dma_handle, size, dir);
>> +__iommu_dma_unmap(iommu_get_domain_for_dev(dev), dma_handle, size);
>
> That wants to be iommu_get_dma_domain() there to minimise the overhead. In 
> fact, I guess this could now be streamlined a bit in the style of 
> iommu_dma_map_page() above - i.e. avoid doing a second domain lookup in the 
> sync case - but that can happen later (if indeed you haven't already).

Yes, this should be iommu_get_dma_domain - this got lost during
a rebase to the kernel version that changed to the iommu_get_dma_domain
calls.

I don't think I've optimized to remove the additional call, but
I can easily throw in another patch to do that.


Re: [PATCH 04/21] dma-iommu: cleanup dma-iommu.h

2019-04-09 Thread Christoph Hellwig
On Fri, Apr 05, 2019 at 06:42:57PM +0100, Robin Murphy wrote:
> Other than introducing this unnecessary dupe,

Fixed.


Re: [PATCH 02/21] arm64/iommu: improve mmap bounds checking

2019-04-09 Thread Christoph Hellwig
On Tue, Apr 09, 2019 at 04:12:51PM +0100, Robin Murphy wrote:
> On 07/04/2019 07:59, Christoph Hellwig wrote:
>> On Fri, Apr 05, 2019 at 06:30:52PM +0100, Robin Murphy wrote:
>>> On 27/03/2019 08:04, Christoph Hellwig wrote:
 The nr_pages checks should be done for all mmap requests, not just those
 using remap_pfn_range.
>>>
>>> Hmm, the logic in iommu_dma_mmap() inherently returns an error for the
>>> "off >= nr_pages" case already. It's also supposed to be robust against the
>>> "vma_pages(vma) > nr_pages - off" condition, although by making the partial
>>> mapping and treating it as a success, rather than doing nothing and
>>> returning an error. What's the exact motivation here?
>>
>> Have one error check at the front of the function that is identical
>> to the mmap checks in the other dma_map_ops instances so that:
>>
>>   a) we get the same error behavior for partial requests everywhere
>>   b) we can lift these checks into common code in the next round.
>>
>
> Fair enough, but in that case why isn't the dma_mmap_from_coherent() path 
> also covered?

dma_mmap_from_coherent currently duplicates those checks itself, and
because of that the other callers also don't include it in their
checks.  I don't actually like that situation and have patches to
refactor and clean up that whole mess by also moving the dma coherent
mmap to common code, and share the checks that I plan to also lift.

But for now I'm holding these back, as they would conflict with this
series and I'm not sure if it will go in and, if it does, whether that
will be through the dma-mapping or iommu tree.
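
For reference, the kind of up-front bounds check being converged on, and
already used by iommu_dma_mmap() elsewhere in this series, looks like the
sketch below (helper name hypothetical):

/* Reject mmap requests whose offset or length exceed the allocation. */
static bool dma_mmap_request_ok(struct vm_area_struct *vma, size_t size)
{
	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
	unsigned long off = vma->vm_pgoff;

	return off < nr_pages && vma_pages(vma) <= nr_pages - off;
}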


[PATCH v2 5/7] iommu/arm-smmu-v3: Link domains and devices

2019-04-09 Thread Jean-Philippe Brucker
When removing a mapping from a domain, we need to send an invalidation to
all devices that might have stored it in their Address Translation Cache
(ATC). In addition when updating the context descriptor of a live domain,
we'll need to send invalidations for all devices attached to it.

Maintain a list of devices in each domain, protected by a spinlock. It is
updated every time we attach or detach devices to and from domains.

It needs to be a spinlock because we'll invalidate ATC entries from
within hardirq-safe contexts, but it may be possible to relax the read
side with RCU later.
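
For illustration, the intended consumer of this list (simplified from the
ATS patch later in this series) walks it under the spinlock to invalidate
each attached master's ATC:

/* Sketch: send an ATC invalidation to every device attached to a domain.
 * arm_smmu_atc_inv_master() stands in for the helper added by the ATS
 * patch; it is assumed here rather than defined. */
static void arm_smmu_atc_inv_domain_sketch(struct arm_smmu_domain *smmu_domain)
{
	unsigned long flags;
	struct arm_smmu_master *master;

	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
	list_for_each_entry(master, &smmu_domain->devices, domain_head)
		arm_smmu_atc_inv_master(master);
	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
}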

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 7b425483f4b6..3e7198ee9530 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -581,6 +581,7 @@ struct arm_smmu_device {
 struct arm_smmu_master {
struct arm_smmu_device  *smmu;
struct arm_smmu_domain  *domain;
+   struct list_headdomain_head;
u32 *sids;
unsigned intnum_sids;
 };
@@ -607,6 +608,9 @@ struct arm_smmu_domain {
};
 
struct iommu_domain domain;
+
+   struct list_headdevices;
+   spinlock_t  devices_lock;
 };
 
 struct arm_smmu_option_prop {
@@ -1504,6 +1508,9 @@ static struct iommu_domain 
*arm_smmu_domain_alloc(unsigned type)
}
 
mutex_init(&smmu_domain->init_mutex);
+   INIT_LIST_HEAD(&smmu_domain->devices);
+   spin_lock_init(&smmu_domain->devices_lock);
+
return &smmu_domain->domain;
 }
 
@@ -1721,9 +1728,16 @@ static void arm_smmu_install_ste_for_dev(struct 
arm_smmu_master *master)
 
 static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 {
-   if (!master->domain)
+   unsigned long flags;
+   struct arm_smmu_domain *smmu_domain = master->domain;
+
+   if (!smmu_domain)
return;
 
+   spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+   list_del(&master->domain_head);
+   spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+
master->domain = NULL;
arm_smmu_install_ste_for_dev(master);
 }
@@ -1731,6 +1745,7 @@ static void arm_smmu_detach_dev(struct arm_smmu_master 
*master)
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
int ret = 0;
+   unsigned long flags;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
@@ -1764,6 +1779,10 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
 
master->domain = smmu_domain;
 
+   spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+   list_add(&master->domain_head, &smmu_domain->devices);
+   spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
arm_smmu_write_ctx_desc(smmu, &smmu_domain->s1_cfg);
 
-- 
2.21.0



[PATCH v2 2/7] iommu/arm-smmu-v3: Rename arm_smmu_master_data to arm_smmu_master

2019-04-09 Thread Jean-Philippe Brucker
The arm_smmu_master_data structure already represents more than just the
firmware data associated to a master, and will be used extensively to
represent a device's state when implementing more SMMU features. Rename
the structure to arm_smmu_master.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d3880010c6cf..50cb037f3d8a 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -591,7 +591,7 @@ struct arm_smmu_device {
 };
 
 /* SMMU private data for each master */
-struct arm_smmu_master_data {
+struct arm_smmu_master {
struct arm_smmu_device  *smmu;
struct arm_smmu_strtab_ent  ste;
 };
@@ -1691,7 +1691,7 @@ static __le64 *arm_smmu_get_step_for_sid(struct 
arm_smmu_device *smmu, u32 sid)
 static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec)
 {
int i, j;
-   struct arm_smmu_master_data *master = fwspec->iommu_priv;
+   struct arm_smmu_master *master = fwspec->iommu_priv;
struct arm_smmu_device *smmu = master->smmu;
 
for (i = 0; i < fwspec->num_ids; ++i) {
@@ -1712,7 +1712,7 @@ static void arm_smmu_install_ste_for_dev(struct 
iommu_fwspec *fwspec)
 static void arm_smmu_detach_dev(struct device *dev)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-   struct arm_smmu_master_data *master = fwspec->iommu_priv;
+   struct arm_smmu_master *master = fwspec->iommu_priv;
 
master->ste.assigned = false;
arm_smmu_install_ste_for_dev(fwspec);
@@ -1724,7 +1724,7 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-   struct arm_smmu_master_data *master;
+   struct arm_smmu_master *master;
struct arm_smmu_strtab_ent *ste;
 
if (!fwspec)
@@ -1860,7 +1860,7 @@ static int arm_smmu_add_device(struct device *dev)
 {
int i, ret;
struct arm_smmu_device *smmu;
-   struct arm_smmu_master_data *master;
+   struct arm_smmu_master *master;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct iommu_group *group;
 
@@ -1913,7 +1913,7 @@ static int arm_smmu_add_device(struct device *dev)
 static void arm_smmu_remove_device(struct device *dev)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-   struct arm_smmu_master_data *master;
+   struct arm_smmu_master *master;
struct arm_smmu_device *smmu;
 
if (!fwspec || fwspec->ops != _smmu_ops)
-- 
2.21.0



[PATCH v2 7/7] iommu/arm-smmu-v3: Disable tagged pointers

2019-04-09 Thread Jean-Philippe Brucker
The ARM architecture has a "Top Byte Ignore" (TBI) option that makes the
MMU mask out bits [63:56] of an address, allowing a userspace application
to store data in its pointers. This option is incompatible with PCI ATS.

If TBI is enabled in the SMMU and userspace triggers DMA transactions on
tagged pointers, the endpoint might create ATC entries for addresses that
include a tag. Software would then have to send ATC invalidation packets
for each of the 255 possible aliases of an address, or just wipe the
whole address space. This is not a viable option, so disable TBI.

The impact of this change is unclear, since there are very few users of
tagged pointers, much less SVA. But the requirement introduced by this
patch doesn't seem excessive: a userspace application using both tagged
pointers and SVA should now sanitize addresses (clear the tag) before
using them for device DMA.
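
As an illustration of that sanitizing step, a userspace helper clearing the
TBI tag before handing an address to a device could look like this
(hypothetical, not part of this patch):

#include <stdint.h>

/* Mask out the top byte (bits 63:56) that TBI lets userspace use as a
 * tag, so only untagged addresses are ever programmed into the device. */
static inline void *untag_pointer(void *tagged)
{
	return (void *)((uintptr_t)tagged & ~((uintptr_t)0xff << 56));
}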

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 7819cd60d08b..811dd7d83bf0 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1061,7 +1061,6 @@ static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr)
val |= ARM_SMMU_TCR2CD(tcr, EPD0);
val |= ARM_SMMU_TCR2CD(tcr, EPD1);
val |= ARM_SMMU_TCR2CD(tcr, IPS);
-   val |= ARM_SMMU_TCR2CD(tcr, TBI0);
 
return val;
 }
-- 
2.21.0



[PATCH v2 4/7] iommu/arm-smmu-v3: Add a master->domain pointer

2019-04-09 Thread Jean-Philippe Brucker
As we're going to track domain-master links more closely for ATS and CD
invalidation, add a pointer to the attached domain in struct
arm_smmu_master. As a result, arm_smmu_strtab_ent is redundant and can be
removed.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 92 ++---
 1 file changed, 45 insertions(+), 47 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 25ba546cae7f..7b425483f4b6 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -505,19 +505,6 @@ struct arm_smmu_s2_cfg {
u64 vtcr;
 };
 
-struct arm_smmu_strtab_ent {
-   /*
-* An STE is "assigned" if the master emitting the corresponding SID
-* is attached to a domain. The behaviour of an unassigned STE is
-* determined by the disable_bypass parameter, whereas an assigned
-* STE behaves according to s1_cfg/s2_cfg, which themselves are
-* configured according to the domain type.
-*/
-   boolassigned;
-   struct arm_smmu_s1_cfg  *s1_cfg;
-   struct arm_smmu_s2_cfg  *s2_cfg;
-};
-
 struct arm_smmu_strtab_cfg {
__le64  *strtab;
dma_addr_t  strtab_dma;
@@ -593,7 +580,7 @@ struct arm_smmu_device {
 /* SMMU private data for each master */
 struct arm_smmu_master {
struct arm_smmu_device  *smmu;
-   struct arm_smmu_strtab_ent  ste;
+   struct arm_smmu_domain  *domain;
u32 *sids;
unsigned intnum_sids;
 };
@@ -1087,8 +1074,8 @@ static void arm_smmu_sync_ste_for_sid(struct 
arm_smmu_device *smmu, u32 sid)
arm_smmu_cmdq_issue_sync(smmu);
 }
 
-static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
- __le64 *dst, struct arm_smmu_strtab_ent 
*ste)
+static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
+ __le64 *dst)
 {
/*
 * This is hideously complicated, but we only really care about
@@ -1108,6 +1095,10 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_device *smmu, u32 sid,
 */
u64 val = le64_to_cpu(dst[0]);
bool ste_live = false;
+   struct arm_smmu_device *smmu = NULL;
+   struct arm_smmu_s1_cfg *s1_cfg = NULL;
+   struct arm_smmu_s2_cfg *s2_cfg = NULL;
+   struct arm_smmu_domain *smmu_domain = NULL;
struct arm_smmu_cmdq_ent prefetch_cmd = {
.opcode = CMDQ_OP_PREFETCH_CFG,
.prefetch   = {
@@ -1115,6 +1106,25 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_device *smmu, u32 sid,
},
};
 
+   if (master) {
+   smmu_domain = master->domain;
+   smmu = master->smmu;
+   }
+
+   if (smmu_domain) {
+   switch (smmu_domain->stage) {
+   case ARM_SMMU_DOMAIN_S1:
+   s1_cfg = &smmu_domain->s1_cfg;
+   break;
+   case ARM_SMMU_DOMAIN_S2:
+   case ARM_SMMU_DOMAIN_NESTED:
+   s2_cfg = &smmu_domain->s2_cfg;
+   break;
+   default:
+   break;
+   }
+   }
+
if (val & STRTAB_STE_0_V) {
switch (FIELD_GET(STRTAB_STE_0_CFG, val)) {
case STRTAB_STE_0_CFG_BYPASS:
@@ -1135,8 +1145,8 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_device *smmu, u32 sid,
val = STRTAB_STE_0_V;
 
/* Bypass/fault */
-   if (!ste->assigned || !(ste->s1_cfg || ste->s2_cfg)) {
-   if (!ste->assigned && disable_bypass)
+   if (!smmu_domain || !(s1_cfg || s2_cfg)) {
+   if (!smmu_domain && disable_bypass)
val |= FIELD_PREP(STRTAB_STE_0_CFG, 
STRTAB_STE_0_CFG_ABORT);
else
val |= FIELD_PREP(STRTAB_STE_0_CFG, 
STRTAB_STE_0_CFG_BYPASS);
@@ -1154,7 +1164,7 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_device *smmu, u32 sid,
return;
}
 
-   if (ste->s1_cfg) {
+   if (s1_cfg) {
BUG_ON(ste_live);
dst[1] = cpu_to_le64(
 FIELD_PREP(STRTAB_STE_1_S1CIR, 
STRTAB_STE_1_S1C_CACHE_WBRA) |
@@ -1169,22 +1179,22 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_device *smmu, u32 sid,
   !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE))
dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
 
-   val |= (ste->s1_cfg->cdptr_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
+   val |= (s1_cfg->cdptr_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS);
}
 
-   if 

[PATCH v2 3/7] iommu/arm-smmu-v3: Store StreamIDs in master

2019-04-09 Thread Jean-Philippe Brucker
Simplify the attach/detach code a bit by keeping a pointer to the stream
IDs in the master structure. Although not completely obvious here, it does
make the subsequent support for ATS, PRI and PASID a bit simpler.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 50cb037f3d8a..25ba546cae7f 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -594,6 +594,8 @@ struct arm_smmu_device {
 struct arm_smmu_master {
struct arm_smmu_device  *smmu;
struct arm_smmu_strtab_ent  ste;
+   u32 *sids;
+   unsigned intnum_sids;
 };
 
 /* SMMU private data for an IOMMU domain */
@@ -1688,19 +1690,18 @@ static __le64 *arm_smmu_get_step_for_sid(struct 
arm_smmu_device *smmu, u32 sid)
return step;
 }
 
-static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec)
+static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
 {
int i, j;
-   struct arm_smmu_master *master = fwspec->iommu_priv;
struct arm_smmu_device *smmu = master->smmu;
 
-   for (i = 0; i < fwspec->num_ids; ++i) {
-   u32 sid = fwspec->ids[i];
+   for (i = 0; i < master->num_sids; ++i) {
+   u32 sid = master->sids[i];
__le64 *step = arm_smmu_get_step_for_sid(smmu, sid);
 
/* Bridged PCI devices may end up with duplicated IDs */
for (j = 0; j < i; j++)
-   if (fwspec->ids[j] == sid)
+   if (master->sids[j] == sid)
break;
if (j < i)
continue;
@@ -1709,13 +1710,10 @@ static void arm_smmu_install_ste_for_dev(struct 
iommu_fwspec *fwspec)
}
 }
 
-static void arm_smmu_detach_dev(struct device *dev)
+static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 {
-   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-   struct arm_smmu_master *master = fwspec->iommu_priv;
-
master->ste.assigned = false;
-   arm_smmu_install_ste_for_dev(fwspec);
+   arm_smmu_install_ste_for_dev(master);
 }
 
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -1736,7 +1734,7 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
 
/* Already attached to a different domain? */
if (ste->assigned)
-   arm_smmu_detach_dev(dev);
+   arm_smmu_detach_dev(master);
 
mutex_lock(&smmu_domain->init_mutex);
 
@@ -1770,7 +1768,7 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
ste->s2_cfg = &smmu_domain->s2_cfg;
}
 
-   arm_smmu_install_ste_for_dev(fwspec);
+   arm_smmu_install_ste_for_dev(master);
 out_unlock:
mutex_unlock(&smmu_domain->init_mutex);
return ret;
@@ -1883,12 +1881,14 @@ static int arm_smmu_add_device(struct device *dev)
return -ENOMEM;
 
master->smmu = smmu;
+   master->sids = fwspec->ids;
+   master->num_sids = fwspec->num_ids;
fwspec->iommu_priv = master;
}
 
/* Check the SIDs are in range of the SMMU and our stream table */
-   for (i = 0; i < fwspec->num_ids; i++) {
-   u32 sid = fwspec->ids[i];
+   for (i = 0; i < master->num_sids; i++) {
+   u32 sid = master->sids[i];
 
if (!arm_smmu_sid_in_range(smmu, sid))
return -ERANGE;
@@ -1922,7 +1922,7 @@ static void arm_smmu_remove_device(struct device *dev)
master = fwspec->iommu_priv;
smmu = master->smmu;
if (master && master->ste.assigned)
-   arm_smmu_detach_dev(dev);
+   arm_smmu_detach_dev(master);
iommu_group_remove_device(dev);
iommu_device_unlink(&smmu->iommu, dev);
kfree(master);
-- 
2.21.0



[PATCH v2 6/7] iommu/arm-smmu-v3: Add support for PCI ATS

2019-04-09 Thread Jean-Philippe Brucker
PCIe devices can implement their own TLB, named Address Translation Cache
(ATC). Enable Address Translation Service (ATS) for devices that support
it and send them invalidation requests whenever we invalidate the IOTLBs.

ATC invalidation is allowed to take up to 90 seconds, according to the
PCIe spec, so it is possible to get a SMMU command queue timeout during
normal operations. However we expect implementations to complete
invalidation in reasonable time.

We only enable ATS for "trusted" devices, and currently rely on the
pci_dev->untrusted bit. For ATS we have to trust that:

(a) The device doesn't issue "translated" memory requests for addresses
that weren't returned by the SMMU in a Translation Completion. In
particular, if we give control of a device or device partition to a VM
or userspace, software cannot program the device to access arbitrary
"translated" addresses.

(b) The device follows permissions granted by the SMMU in a Translation
Completion. If the device requested read+write permission and only
got read, then it doesn't write.

(c) The device doesn't send Translated transactions for an address that
was invalidated by an ATC invalidation.

Note that the PCIe specification explicitly requires all of these, so we
can assume that implementations will cleanly shield ATCs from software.

All ATS translated requests still go through the SMMU, to walk the stream
table and check that the device is actually allowed to send translated
requests.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 196 +++-
 1 file changed, 191 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 3e7198ee9530..7819cd60d08b 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -86,6 +87,7 @@
 #define IDR5_VAX_52_BIT1
 
 #define ARM_SMMU_CR0   0x20
+#define CR0_ATSCHK (1 << 4)
 #define CR0_CMDQEN (1 << 3)
 #define CR0_EVTQEN (1 << 2)
 #define CR0_PRIQEN (1 << 1)
@@ -294,6 +296,7 @@
 #define CMDQ_ERR_CERROR_NONE_IDX   0
 #define CMDQ_ERR_CERROR_ILL_IDX1
 #define CMDQ_ERR_CERROR_ABT_IDX2
+#define CMDQ_ERR_CERROR_ATC_INV_IDX3
 
 #define CMDQ_0_OP  GENMASK_ULL(7, 0)
 #define CMDQ_0_SSV (1UL << 11)
@@ -312,6 +315,12 @@
 #define CMDQ_TLBI_1_VA_MASKGENMASK_ULL(63, 12)
 #define CMDQ_TLBI_1_IPA_MASK   GENMASK_ULL(51, 12)
 
+#define CMDQ_ATC_0_SSIDGENMASK_ULL(31, 12)
+#define CMDQ_ATC_0_SID GENMASK_ULL(63, 32)
+#define CMDQ_ATC_0_GLOBAL  (1UL << 9)
+#define CMDQ_ATC_1_SIZEGENMASK_ULL(5, 0)
+#define CMDQ_ATC_1_ADDR_MASK   GENMASK_ULL(63, 12)
+
 #define CMDQ_PRI_0_SSIDGENMASK_ULL(31, 12)
 #define CMDQ_PRI_0_SID GENMASK_ULL(63, 32)
 #define CMDQ_PRI_1_GRPID   GENMASK_ULL(8, 0)
@@ -365,6 +374,11 @@ module_param_named(disable_bypass, disable_bypass, bool, 
S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
"Disable bypass streams such that incoming transactions from devices 
that are not attached to an iommu domain will report an abort back to the 
device and will not be allowed to pass through the SMMU.");
 
+static bool disable_ats_check;
+module_param_named(disable_ats_check, disable_ats_check, bool, S_IRUGO);
+MODULE_PARM_DESC(disable_ats_check,
+   "By default, the SMMU checks whether each incoming transaction marked 
as translated is allowed by the stream configuration. This option disables the 
check.");
+
 enum pri_resp {
PRI_RESP_DENY = 0,
PRI_RESP_FAIL = 1,
@@ -433,6 +447,16 @@ struct arm_smmu_cmdq_ent {
u64 addr;
} tlbi;
 
+   #define CMDQ_OP_ATC_INV 0x40
+   #define ATC_INV_SIZE_ALL52
+   struct {
+   u32 sid;
+   u32 ssid;
+   u64 addr;
+   u8  size;
+   boolglobal;
+   } atc;
+
#define CMDQ_OP_PRI_RESP0x41
struct {
u32 sid;
@@ -580,10 +604,12 @@ struct arm_smmu_device {
 /* SMMU private data for each master */
 struct arm_smmu_master {
struct arm_smmu_device  *smmu;
+   struct device   *dev;
struct arm_smmu_domain  *domain;
struct list_headdomain_head;
u32 *sids;
unsigned int   

[PATCH v2 0/7] Add PCI ATS support to Arm SMMUv3

2019-04-09 Thread Jean-Philippe Brucker
This series enables PCI ATS in SMMUv3. Changes since v1 [1]:

* Simplify the SMMU structures (patches 2-4 are new).

* Don't enable ATS for devices that are attached to a bypass domain,
  because in that case a translation request would cause F_BAD_ATS_TREQ.
  Enable ATS in attach_dev() rather than add_device().

* Enable ATS for stage-1 and stage-2 STEs alike. There is no benefit to
  disabling ATS for stage-2 domains.

[1] https://www.spinics.net/lists/arm-kernel/msg714628.html

Jean-Philippe Brucker (7):
  ACPI/IORT: Check ATS capability in root complex nodes
  iommu/arm-smmu-v3: Rename arm_smmu_master_data to arm_smmu_master
  iommu/arm-smmu-v3: Store StreamIDs in master
  iommu/arm-smmu-v3: Add a master->domain pointer
  iommu/arm-smmu-v3: Link domains and devices
  iommu/arm-smmu-v3: Add support for PCI ATS
  iommu/arm-smmu-v3: Disable tagged pointers

 drivers/acpi/arm64/iort.c   |  11 ++
 drivers/iommu/arm-smmu-v3.c | 340 
 include/linux/iommu.h   |   4 +
 3 files changed, 286 insertions(+), 69 deletions(-)

-- 
2.21.0



[PATCH v2 1/7] ACPI/IORT: Check ATS capability in root complex nodes

2019-04-09 Thread Jean-Philippe Brucker
Root complex node in IORT has a bit telling whether it supports ATS or
not. Store this bit in the IOMMU fwspec when setting up a device, so it
can be accessed later by an IOMMU driver.

Use the negative version (NO_ATS) at the moment because it's not clear
if/how the bit needs to be integrated in other firmware descriptions. The
SMMU has a feature bit telling if it supports ATS, which might be
sufficient in most systems for deciding whether or not we should enable
the ATS capability in endpoints.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/acpi/arm64/iort.c | 11 +++
 include/linux/iommu.h |  4 
 2 files changed, 15 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index e48894e002ba..7f2c1c9c6b38 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1028,6 +1028,14 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, 
u64 *dma_size)
dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
 }
 
+static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node)
+{
+   struct acpi_iort_root_complex *pci_rc;
+
+   pci_rc = (struct acpi_iort_root_complex *)node->node_data;
+   return pci_rc->ats_attribute & ACPI_IORT_ATS_SUPPORTED;
+}
+
 /**
  * iort_iommu_configure - Set-up IOMMU configuration for a device.
  *
@@ -1063,6 +1071,9 @@ const struct iommu_ops *iort_iommu_configure(struct 
device *dev)
info.node = node;
err = pci_for_each_dma_alias(to_pci_dev(dev),
 iort_pci_iommu_init, );
+
+   if (!err && !iort_pci_rc_supports_ats(node))
+   dev->iommu_fwspec->flags |= IOMMU_FWSPEC_PCI_NO_ATS;
} else {
int i = 0;
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 3dbeb457fb16..ed6738c358ca 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -509,10 +509,14 @@ struct iommu_fwspec {
const struct iommu_ops  *ops;
struct fwnode_handle*iommu_fwnode;
void*iommu_priv;
+   u32 flags;
unsigned intnum_ids;
u32 ids[1];
 };
 
+/* Firmware disabled ATS in the root complex */
+#define IOMMU_FWSPEC_PCI_NO_ATS(1 << 0)
+
 int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
  const struct iommu_ops *ops);
 void iommu_fwspec_free(struct device *dev);
-- 
2.21.0



Re: [PATCH v2 2/3] drm: Add a drm_gem_objects_lookup helper

2019-04-09 Thread Eric Anholt
Rob Herring  writes:

> On Mon, Apr 1, 2019 at 10:43 AM Eric Anholt  wrote:
>>
>> Chris Wilson  writes:
>>
>> > Quoting Daniel Vetter (2019-04-01 14:06:48)
>> >> On Mon, Apr 1, 2019 at 9:47 AM Rob Herring  wrote:
>> >> > +{
>> >> > +   int i, ret = 0;
>> >> > +   struct drm_gem_object *obj;
>> >> > +
>> >> > +   spin_lock(>table_lock);
>> >> > +
>> >> > +   for (i = 0; i < count; i++) {
>> >> > +   /* Check if we currently have a reference on the object 
>> >> > */
>> >> > +   obj = idr_find(>object_idr, handle[i]);
>> >> > +   if (!obj) {
>> >> > +   ret = -ENOENT;
>> >
>> > Unwind previous drm_gem_object_get(), the caller has no idea how many
>> > were processed before the error.
>>
>> I had the same thought, but the pattern we have is that you're loading
>> into a refcounted struct that will free the BOs when you're done,
>> anyway.
>
> The real bug here is if allocation of the array fails. The BO array
> may be NULL when the count is not. So this V3D cleanup hunk:
>
> for (i = 0; i < exec->bo_count; i++)
>   drm_gem_object_put_unlocked(>bo[i]->base.base);
>   kvfree(exec->bo);
>
> Needs to be wrapped with 'if (exec->bo)'. We have a similar problem
> with fence arrays too.

Yeah, seems legit.  Not really going to write new patches when I've got
month-old critical patches I can't get acked, though. :/
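
For reference, the guarded cleanup Rob describes above would look roughly
like this (illustrative sketch only, not a submitted patch):

	if (exec->bo) {
		for (i = 0; i < exec->bo_count; i++)
			drm_gem_object_put_unlocked(&exec->bo[i]->base.base);
		kvfree(exec->bo);
	}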



Re: [PATCH 05/18] iommu: introduce device fault data

2019-04-09 Thread Jacob Pan
On Tue, 9 Apr 2019 13:03:15 +0300
Andriy Shevchenko  wrote:

> On Mon, Apr 08, 2019 at 04:59:20PM -0700, Jacob Pan wrote:
> > Device faults detected by IOMMU can be reported outside the IOMMU
> > subsystem for further processing. This patch introduces
> > a generic device fault data structure.
> > 
> > The fault can be either an unrecoverable fault or a page request,
> > also referred to as a recoverable fault.
> > 
> > We only care about non internal faults that are likely to be
> > reported to an external subsystem.
> > 
> > Signed-off-by: Jacob Pan 
> > Signed-off-by: Jean-Philippe Brucker 
> > Signed-off-by: Liu, Yi L 
> > Signed-off-by: Ashok Raj 
> > Signed-off-by: Eric Auger   
> 
> JFYI: we have a Co-developed-by tag as well for better granularity of
> what certain person/people did.
> 
Good to know, thanks!


Re: [PATCH 08/18] iommu: Introduce cache_invalidate API

2019-04-09 Thread Jacob Pan
On Tue, 9 Apr 2019 13:07:18 +0300
Andriy Shevchenko  wrote:

> On Mon, Apr 08, 2019 at 04:59:23PM -0700, Jacob Pan wrote:
> > From: "Liu, Yi L" 
> > 
> > In any virtualization use case, when the first translation stage
> > is "owned" by the guest OS, the host IOMMU driver has no knowledge
> > of caching structure updates unless the guest invalidation
> > activities are trapped by the virtualizer and passed down to the
> > host.
> > 
> > Since the invalidation data are obtained from user space and will be
> > written into physical IOMMU, we must allow security check at various
> > layers. Therefore, generic invalidation data format are proposed
> > here, model specific IOMMU drivers need to convert them into their
> > own format.  
> 
> > +int iommu_cache_invalidate(struct iommu_domain *domain, struct
> > device *dev,
> > +  struct iommu_cache_invalidate_info
> > *inv_info) +{
> > +   int ret = 0;  
> 
> Redundant assignment.
> 
I am not a security expert, but initializing local variables can be
more secure.
I was looking at this talk.
https://outflux.net/slides/2018/lss/danger.pdf
https://cwe.mitre.org/data/definitions/457.html

> > +
> > +   if (unlikely(!domain->ops->cache_invalidate))
> > +   return -ENODEV;
> > +
> > +   ret = domain->ops->cache_invalidate(domain, dev, inv_info);
> > +
> > +   return ret;
> > +}  
> 

[Jacob Pan]


Re: [PATCH 11/21] dma-iommu: refactor page array remap helpers

2019-04-09 Thread Robin Murphy

On 27/03/2019 08:04, Christoph Hellwig wrote:

Move the call to dma_common_pages_remap / dma_common_free_remap  into
__iommu_dma_alloc / __iommu_dma_free and rename those functions to
better describe what they do.  This keeps the functionality that
allocates and remaps a non-contigous array of pages nicely abstracted
out from the calling code.

Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 75 +++
  1 file changed, 36 insertions(+), 39 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4d46beeea5b7..2013c650718a 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -524,51 +524,57 @@ static struct page **__iommu_dma_alloc_pages(struct 
device *dev,
  }
  
  /**

- * iommu_dma_free - Free a buffer allocated by __iommu_dma_alloc()
+ * iommu_dma_free_remap - Free a buffer allocated by iommu_dma_alloc_remap


Unmap and free a buffer allocated by iommu_dma_alloc_remap()


   * @dev: Device which owns this buffer
- * @pages: Array of buffer pages as returned by __iommu_dma_alloc()
   * @size: Size of buffer in bytes
+ * @cpu_address: Virtual address of the buffer
   * @handle: DMA address of buffer


@dma_handle


   *
   * Frees both the pages associated with the buffer, and the array
   * describing them


and removes the CPU mapping.


   */
-static void __iommu_dma_free(struct device *dev, struct page **pages,
-   size_t size, dma_addr_t *handle)
+static void iommu_dma_free_remap(struct device *dev, size_t size,
+   void *cpu_addr, dma_addr_t dma_handle)
  {
-   __iommu_dma_unmap(iommu_get_dma_domain(dev), *handle, size);
-   __iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
-   *handle = DMA_MAPPING_ERROR;
+   struct vm_struct *area = find_vm_area(cpu_addr);
+
+   if (WARN_ON(!area || !area->pages))
+   return;
+   __iommu_dma_unmap(iommu_get_dma_domain(dev), dma_handle, size);
+   __iommu_dma_free_pages(area->pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
+   dma_common_free_remap(cpu_addr, PAGE_ALIGN(size), VM_USERMAP);
  }
  
  /**

- * __iommu_dma_alloc - Allocate and map a buffer contiguous in IOVA space
+ * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA space


I'm not sure of a succinct way to update that one :(

Other than kerneldoc nits, though,

Reviewed-by: Robin Murphy 


   * @dev: Device to allocate memory for. Must be a real device
   * attached to an iommu_dma_domain
   * @size: Size of buffer in bytes
+ * @dma_handle: Out argument for allocated DMA handle
   * @gfp: Allocation flags
   * @attrs: DMA attributes for this allocation
- * @prot: IOMMU mapping flags
- * @handle: Out argument for allocated DMA handle
   *
   * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
   * but an IOMMU which supports smaller pages might not map the whole thing.
   *
- * Return: Array of struct page pointers describing the buffer,
- *or NULL on failure.
+ * Return: Mapped virtual address, or NULL on failure.
   */
-static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
-   gfp_t gfp, unsigned long attrs, int prot, dma_addr_t *handle)
+static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
+   dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
  {
struct iommu_domain *domain = iommu_get_dma_domain(dev);
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = >iovad;
+   bool coherent = dev_is_dma_coherent(dev);
+   int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
+   pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
+   unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
struct page **pages;
struct sg_table sgt;
dma_addr_t iova;
-   unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
+   void *vaddr;
  
-	*handle = DMA_MAPPING_ERROR;

+   *dma_handle = DMA_MAPPING_ERROR;
  
  	min_size = alloc_sizes & -alloc_sizes;

if (min_size < PAGE_SIZE) {
@@ -594,7 +600,7 @@ static struct page **__iommu_dma_alloc(struct device *dev, 
size_t size,
if (sg_alloc_table_from_pages(, pages, count, 0, size, GFP_KERNEL))
goto out_free_iova;
  
-	if (!(prot & IOMMU_CACHE)) {

+   if (!(ioprot & IOMMU_CACHE)) {
struct scatterlist *sg;
int i;
  
@@ -602,14 +608,21 @@ static struct page **__iommu_dma_alloc(struct device *dev, size_t size,

arch_dma_prep_coherent(sg_page(sg), sg->length);
}
  
-	if (iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, prot)

+   if (iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, ioprot)
< size)
goto out_free_sg;
  
-	*handle = iova;

+   vaddr = dma_common_pages_remap(pages, size, 

Re: [PATCH 09/18] iommu/vt-d: Enlightened PASID allocation

2019-04-09 Thread Jacob Pan
On Tue, 9 Apr 2019 13:08:59 +0300
Andriy Shevchenko  wrote:

> On Mon, Apr 08, 2019 at 04:59:24PM -0700, Jacob Pan wrote:
> > From: Lu Baolu 
> > 
> > If Intel IOMMU runs in caching mode, a.k.a. virtual IOMMU, the
> > IOMMU driver should rely on the emulation software to allocate
> > and free PASID IDs. The Intel vt-d spec revision 3.0 defines a
> > register set to support this. This includes a capability register,
> > a virtual command register and a virtual response register. Refer
> > to section 10.4.42, 10.4.43, 10.4.44 for more information.
> > 
> > This patch adds the enlightened PASID allocation/free interfaces
> > via the virtual command register.  
> 
> > +   pr_debug("vcmd alloc pasid\n");  
> 
> Perhaps tracepoints should be in use?
> OTOH we have function tracer, so, this message in any case is a noise.
> And this is applicable to other similar cases.
> 

Agreed, I added that for debug purpose.
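
For reference, a minimal sketch of what a tracepoint replacement could
look like (the event name and field are made up here, and the usual
TRACE_SYSTEM / <trace/define_trace.h> header boilerplate is omitted):

TRACE_EVENT(vcmd_alloc_pasid,
	TP_PROTO(u32 pasid),
	TP_ARGS(pasid),
	TP_STRUCT__entry(
		__field(u32, pasid)
	),
	TP_fast_assign(
		__entry->pasid = pasid;
	),
	TP_printk("pasid=%u", __entry->pasid)
);

The old pr_debug() site would then simply call trace_vcmd_alloc_pasid(pasid).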

Thanks,

Jacob


Re: [PATCH 00/18] Shared virtual address IOMMU and VT-d support

2019-04-09 Thread Jacob Pan
On Tue, 9 Apr 2019 12:56:23 +0300
Andriy Shevchenko  wrote:

> On Mon, Apr 08, 2019 at 04:59:15PM -0700, Jacob Pan wrote:
> > Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on
> > Intel platforms allow address space sharing between device DMA and
> > applications. SVA can reduce programming complexity and enhance
> > security. This series is intended to enable SVA virtualization,
> > i.e. shared guest application address space and physical device DMA
> > address. Only IOMMU portion of the changes are included in this
> > series. Additional support is needed in VFIO and QEMU (will be
> > submitted separately) to complete this functionality.
> > 
> > To make incremental changes and reduce the size of each patchset.
> > This series does not include support for page request services.
> > 
> > In VT-d implementation, PASID table is per device and maintained in
> > the host. Guest PASID table is shadowed in VMM where virtual IOMMU
> > is emulated.  
> 
> This seems missed the comments I gave you internally.
> 
I didn't include that since the code you commented on is from Jean and
I don't have a strong opinion either way. I thought I explained that.
Sorry for the miscommunication.

Thanks for your review,

Jacob
> > 
> > .-.  .---.
> > |   vIOMMU|  | Guest process CR3, FL only|
> > | |  '---'
> > ./
> > | PASID Entry |--- PASID cache flush -
> > '-'   |
> > | |   V
> > | |CR3 in GPA
> > '-'
> > Guest
> > --| Shadow |--|
> >   vv  v
> > Host
> > .-.  .--.
> > |   pIOMMU|  | Bind FL for GVA-GPA  |
> > | |  '--'
> > ./  |
> > | PASID Entry | V (Nested xlate)
> > '\.--.
> > | |   |SL for GPA-HPA, default domain|
> > | |   '--'
> > '-'
> > Where:
> >  - FL = First level/stage one page tables
> >  - SL = Second level/stage two page tables
> > 
> > 
> > This work is based on collaboration with other developers on the
> > IOMMU mailing list. Notably,
> > 
> > [1] [PATCH v6 00/22] SMMUv3 Nested Stage Setup by Eric Auger
> > https://lkml.org/lkml/2019/3/17/124
> > 
> > [2] [RFC PATCH 2/6] drivers core: Add I/O ASID allocator by
> > Jean-Philippe Brucker
> > https://www.spinics.net/lists/iommu/msg30639.html
> > 
> > [3] [RFC PATCH 0/5] iommu: APIs for paravirtual PASID allocation by
> > Lu Baolu https://lkml.org/lkml/2018/11/12/1921
> > 
> > There are roughly three parts:
> > 1. Generic PASID allocator [1] with extension to support custom
> > allocator 2. IOMMU cache invalidation passdown from guest to host
> > 3. Guest PASID bind for nested translation
> > 
> > All generic IOMMU APIs are reused from [1], which has a v7 just
> > published with no real impact to the patches used here. It is worth
> > noting that unlike sMMU nested stage setup, where PASID table is
> > owned by the guest, VT-d PASID table is owned by the host,
> > individual PASIDs are bound instead of the PASID table.
> > 
> > 
> > Jacob Pan (15):
> >   ioasid: Add custom IOASID allocator
> >   ioasid: Convert ioasid_idr to XArray
> >   driver core: add per device iommu param
> >   iommu: introduce device fault data
> >   iommu: introduce device fault report API
> >   iommu: Introduce attach/detach_pasid_table API
> >   iommu/vt-d: Add custom allocator for IOASID
> >   iommu/vt-d: Replace Intel specific PASID allocator with IOASID
> >   iommu: Add guest PASID bind function
> >   iommu/vt-d: Move domain helper to header
> >   iommu/vt-d: Add nested translation support
> >   iommu/vt-d: Add bind guest PASID support
> >   iommu: add max num of cache and granu types
> >   iommu/vt-d: Support flushing more translation cache types
> >   iommu/vt-d: Add svm/sva invalidate function
> > 
> > Jean-Philippe Brucker (1):
> >   drivers core: Add I/O ASID allocator
> > 
> > Liu, Yi L (1):
> >   iommu: Introduce cache_invalidate API
> > 
> > Lu Baolu (1):
> >   iommu/vt-d: Enlightened PASID allocation
> > 
> >  drivers/base/Kconfig|   7 ++
> >  drivers/base/Makefile   |   1 +
> >  drivers/base/ioasid.c   | 211
> > + drivers/iommu/Kconfig
> > |   1 + drivers/iommu/dmar.c|  48 +
> >  drivers/iommu/intel-iommu.c | 219
> > -- drivers/iommu/intel-pasid.c
> > | 191 +-
> > drivers/iommu/intel-pasid.h |  24 - drivers/iommu/intel-svm.c
> > | 217 +++---
> > drivers/iommu/iommu.c   | 207
> > +++- include/linux/device.h
> > |   3 + 

Re: [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver

2019-04-09 Thread Rob Herring
On Tue, Apr 9, 2019 at 10:56 AM Tomeu Vizoso  wrote:
>
> On Mon, 8 Apr 2019 at 23:04, Rob Herring  wrote:
> >
> > On Fri, Apr 5, 2019 at 7:30 AM Steven Price  wrote:
> > >
> > > On 01/04/2019 08:47, Rob Herring wrote:
> > > > This adds the initial driver for panfrost which supports Arm Mali
> > > > Midgard and Bifrost family of GPUs. Currently, only the T860 and
> > > > T760 Midgard GPUs have been tested.
> >
> > [...]
> > > > +
> > > > + if (status & JOB_INT_MASK_ERR(j)) {
> > > > + job_write(pfdev, JS_COMMAND_NEXT(j), 
> > > > JS_COMMAND_NOP);
> > > > + job_write(pfdev, JS_COMMAND(j), 
> > > > JS_COMMAND_HARD_STOP_0);
> > >
> > > Hard-stopping an already completed job isn't likely to do very much :)
> > > Also you are using the "_0" version which is only valid when "job chain
> > > disambiguation" is present.
>
> Yeah, guess that can be removed.

Steven, TBC, are you suggesting removing both lines or leaving
JS_COMMAND_NOP? I don't think we can ever have a pending job at this
point as there's no queuing. So the NOP probably isn't needed, but
doesn't hurt to have it either.

> > > I suspect in this case you should also be signalling the fence? At the
> > > moment you rely on the GPU timeout recovering from the fault.
> >
> > I'll defer to Tomeu who wrote this (IIRC).
>
> Yes, that would be an improvement.

Actually, I think that would break recovery because the job timeout
will bail out if the done fence is signaled already. Perhaps we want
to timeout immediately if that is possible with the scheduler.

Rob


Re: [PATCH 01/18] drivers core: Add I/O ASID allocator

2019-04-09 Thread Paul E. McKenney
On Tue, Apr 09, 2019 at 01:30:30PM +0300, Andriy Shevchenko wrote:
> On Tue, Apr 09, 2019 at 03:04:36AM -0700, Christoph Hellwig wrote:
> > On Tue, Apr 09, 2019 at 01:00:49PM +0300, Andriy Shevchenko wrote:
> > > I think it makes sense to add a helper macro to rcupdate.h
> > > (and we have several cases in kernel that can utilize it)
> > > 
> > > #define kfree_non_null_rcu(ptr, rcu_head) \
> > >   do {\
> > >   if (ptr)\
> > >   kfree_rcu(ptr, rcu_head);   \
> > >   } while (0)
> > > 
> > > as a more common pattern for resource deallocators.
> > 
> > I think that should move straight into kfree_rcu.  
> 
> Possible. I didn't dare to offer this due to lack of knowledge how it's used 
> in
> other places.
> 
> > In general
> > we expect *free* to deal with NULL pointers transparently, so we
> > should do so here as well.
> 
> Exactly my point, thanks.

As shown below?

And now that you mention it, it is a bit surprising that no one has
complained before.  ;-)

Thanx, Paul



commit 23ad938244968e9d2a8001a1c52887c113b182f6
Author: Paul E. McKenney 
Date:   Tue Apr 9 07:48:18 2019 -0700

rcu: Make kfree_rcu() ignore NULL pointers

This commit makes the kfree_rcu() macro's semantics be consistent
with the likes of kfree() by adding a check for NULL pointers, so
that kfree_rcu(NULL, ...) is a no-op.

Reported-by: Andriy Shevchenko 
Reported-by: Christoph Hellwig 
Signed-off-by: Paul E. McKenney 

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 922bb6848813..c68649b9bcec 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -828,9 +828,13 @@ static inline notrace void 
rcu_read_unlock_sched_notrace(void)
  * The BUILD_BUG_ON check must not involve any function calls, hence the
  * checks are done in macros here.
  */
-#define kfree_rcu(ptr, rcu_head)   \
-   __kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
-
+#define kfree_rcu(ptr, rhf)\
+do {   \
+   typeof (ptr) ___p = (ptr);  \
+   \
+   if (___p)   \
+   __kfree_rcu(&((___p)->rhf), offsetof(typeof(*(ptr)), rhf)); \
+} while (0)
 
 /*
  * Place this after a lock-acquisition primitive to guarantee that
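
As a usage note, after this change callers can drop their own NULL
checks; for example (the struct and helper names below are illustrative
only):

	struct foo {
		struct rcu_head rcu;
		int data;
	};

	struct foo *p = lookup_foo();	/* hypothetical helper, may return NULL */

	kfree_rcu(p, rcu);		/* now a no-op when p is NULL */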



Re: [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver

2019-04-09 Thread Tomeu Vizoso
On Mon, 8 Apr 2019 at 23:04, Rob Herring  wrote:
>
> On Fri, Apr 5, 2019 at 7:30 AM Steven Price  wrote:
> >
> > On 01/04/2019 08:47, Rob Herring wrote:
> > > This adds the initial driver for panfrost which supports Arm Mali
> > > Midgard and Bifrost family of GPUs. Currently, only the T860 and
> > > T760 Midgard GPUs have been tested.
>
> [...]
> > > +
> > > + if (status & JOB_INT_MASK_ERR(j)) {
> > > + job_write(pfdev, JS_COMMAND_NEXT(j), 
> > > JS_COMMAND_NOP);
> > > + job_write(pfdev, JS_COMMAND(j), 
> > > JS_COMMAND_HARD_STOP_0);
> >
> > Hard-stopping an already completed job isn't likely to do very much :)
> > Also you are using the "_0" version which is only valid when "job chain
> > disambiguation" is present.

Yeah, guess that can be removed.

> > I suspect in this case you should also be signalling the fence? At the
> > moment you rely on the GPU timeout recovering from the fault.
>
> I'll defer to Tomeu who wrote this (IIRC).

Yes, that would be an improvement.

> > One issue that I haven't got to the bottom of is that I can trigger a
> > lockdep splat:
> >
> > -8<--
> > panfrost ffa3.gpu: js fault, js=1, status=JOB_CONFIG_FAULT,
> > head=0x0, tail=0x0
> > root@debian:~/ddk_panfrost# panfrost ffa3.gpu: gpu sched timeout,
> > js=1, status=0x40, head=0x0, tail=0x0, sched_job=12a94ba6
> >
> > ==
> > WARNING: possible circular locking dependency detected
> > 5.0.0+ #32 Not tainted
> > --
> > kworker/1:0/608 is trying to acquire lock:
> > 89b1e2d8 (&(>job_lock)->rlock){-.-.}, at:
> > dma_fence_remove_callback+0x14/0x50
> >
> > but task is already holding lock:
> > a887e4b2 (&(>job_list_lock)->rlock){-.-.}, at:
> > drm_sched_stop+0x24/0x10c
> >
> > which lock already depends on the new lock.
> >
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #1 (&(>job_list_lock)->rlock){-.-.}:
> >drm_sched_process_job+0x60/0x208
> >dma_fence_signal+0x1dc/0x1fc
> >panfrost_job_irq_handler+0x160/0x194
> >__handle_irq_event_percpu+0x80/0x388
> >handle_irq_event_percpu+0x24/0x78
> >handle_irq_event+0x38/0x5c
> >handle_fasteoi_irq+0xb4/0x128
> >generic_handle_irq+0x18/0x28
> >__handle_domain_irq+0xa0/0xb4
> >gic_handle_irq+0x4c/0x78
> >__irq_svc+0x70/0x98
> >arch_cpu_idle+0x20/0x3c
> >arch_cpu_idle+0x20/0x3c
> >do_idle+0x11c/0x22c
> >cpu_startup_entry+0x18/0x20
> >start_kernel+0x398/0x420
> >
> > -> #0 (&(>job_lock)->rlock){-.-.}:
> >_raw_spin_lock_irqsave+0x50/0x64
> >dma_fence_remove_callback+0x14/0x50
> >drm_sched_stop+0x5c/0x10c
> >panfrost_job_timedout+0xd0/0x180
> >drm_sched_job_timedout+0x34/0x5c
> >process_one_work+0x2ac/0x6a4
> >worker_thread+0x28c/0x3fc
> >kthread+0x13c/0x158
> >ret_from_fork+0x14/0x20
> >  (null)
> >
> > other info that might help us debug this:
> >
> >  Possible unsafe locking scenario:
> >
> >CPU0CPU1
> >
> >   lock(&(>job_list_lock)->rlock);
> >lock(&(>job_lock)->rlock);
> >lock(&(>job_list_lock)->rlock);
> >   lock(&(>job_lock)->rlock);
> >
> >  *** DEADLOCK ***
> >
> > 3 locks held by kworker/1:0/608:
> >  #0: 9b350627 ((wq_completion)"events"){+.+.}, at:
> > process_one_work+0x1f8/0x6a4
> >  #1: a802aa2d ((work_completion)(&(>work_tdr)->work)){+.+.}, at:
> > process_one_work+0x1f8/0x6a4
> >  #2: a887e4b2 (&(>job_list_lock)->rlock){-.-.}, at:
> > drm_sched_stop+0x24/0x10c
> >
> > stack backtrace:
> > CPU: 1 PID: 608 Comm: kworker/1:0 Not tainted 5.0.0+ #32
> > Hardware name: Rockchip (Device Tree)
> > Workqueue: events drm_sched_job_timedout
> > [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
> > [] (show_stack) from [] (dump_stack+0x9c/0xd4)
> > [] (dump_stack) from []
> > (print_circular_bug.constprop.15+0x1fc/0x2cc)
> > [] (print_circular_bug.constprop.15) from []
> > (__lock_acquire+0xe5c/0x167c)
> > [] (__lock_acquire) from [] (lock_acquire+0xc4/0x210)
> > [] (lock_acquire) from []
> > (_raw_spin_lock_irqsave+0x50/0x64)
> > [] (_raw_spin_lock_irqsave) from []
> > (dma_fence_remove_callback+0x14/0x50)
> > [] (dma_fence_remove_callback) from []
> > (drm_sched_stop+0x5c/0x10c)
> > [] (drm_sched_stop) from []
> > (panfrost_job_timedout+0xd0/0x180)
> > [] (panfrost_job_timedout) from []
> > (drm_sched_job_timedout+0x34/0x5c)
> > [] (drm_sched_job_timedout) from []
> > (process_one_work+0x2ac/0x6a4)
> > [] (process_one_work) from []
> > (worker_thread+0x28c/0x3fc)
> > [] (worker_thread) from [] (kthread+0x13c/0x158)
> > [] (kthread) from [] (ret_from_fork+0x14/0x20)
> > Exception stack(0xeebd7fb0 to 0xeebd7ff8)
> > 7fa0: 

Re: [PATCH 10/21] dma-iommu: move __iommu_dma_map

2019-04-09 Thread Robin Murphy

On 27/03/2019 08:04, Christoph Hellwig wrote:

Moving this function up to its unmap counterpart helps to keep related
code together for the following changes.


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 46 +++
  1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 57f2d8621112..4d46beeea5b7 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -435,6 +435,29 @@ static void __iommu_dma_unmap(struct iommu_domain *domain, 
dma_addr_t dma_addr,
iommu_dma_free_iova(cookie, dma_addr, size);
  }
  
+static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,

+   size_t size, int prot, struct iommu_domain *domain)
+{
+   struct iommu_dma_cookie *cookie = domain->iova_cookie;
+   size_t iova_off = 0;
+   dma_addr_t iova;
+
+   if (cookie->type == IOMMU_DMA_IOVA_COOKIE) {
+   iova_off = iova_offset(>iovad, phys);
+   size = iova_align(>iovad, size + iova_off);
+   }
+
+   iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
+   if (!iova)
+   return DMA_MAPPING_ERROR;
+
+   if (iommu_map(domain, iova, phys - iova_off, size, prot)) {
+   iommu_dma_free_iova(cookie, iova, size);
+   return DMA_MAPPING_ERROR;
+   }
+   return iova + iova_off;
+}
+
  static void __iommu_dma_free_pages(struct page **pages, int count)
  {
while (count--)
@@ -689,29 +712,6 @@ static void iommu_dma_sync_sg_for_device(struct device 
*dev,
arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
  }
  
-static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,

-   size_t size, int prot, struct iommu_domain *domain)
-{
-   struct iommu_dma_cookie *cookie = domain->iova_cookie;
-   size_t iova_off = 0;
-   dma_addr_t iova;
-
-   if (cookie->type == IOMMU_DMA_IOVA_COOKIE) {
-   iova_off = iova_offset(>iovad, phys);
-   size = iova_align(>iovad, size + iova_off);
-   }
-
-   iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
-   if (!iova)
-   return DMA_MAPPING_ERROR;
-
-   if (iommu_map(domain, iova, phys - iova_off, size, prot)) {
-   iommu_dma_free_iova(cookie, iova, size);
-   return DMA_MAPPING_ERROR;
-   }
-   return iova + iova_off;
-}
-
  static dma_addr_t __iommu_dma_map_page(struct device *dev, struct page *page,
unsigned long offset, size_t size, int prot)
  {




Re: [PATCH 09/21] dma-iommu: refactor iommu_dma_get_sgtable

2019-04-09 Thread Robin Murphy

On 27/03/2019 08:04, Christoph Hellwig wrote:

Move the vm_area handling into a new iommu_dma_get_sgtable_remap helper.

Inline __iommu_dma_get_sgtable_page into the main function to simplify
the code flow a bit.

Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 54 +--
  1 file changed, 24 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 43bd3c7e0f6b..57f2d8621112 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -625,6 +625,18 @@ static int iommu_dma_mmap_remap(void *cpu_addr, size_t 
size,
return ret;
  }
  
+static int iommu_dma_get_sgtable_remap(struct sg_table *sgt, void *cpu_addr,

+   size_t size)
+{
+   unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+   struct vm_struct *area = find_vm_area(cpu_addr);
+
+   if (WARN_ON(!area || !area->pages))
+   return -ENXIO;
+   return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
+   GFP_KERNEL);
+}
+


Is this complex enough to deserve being broken out? Really I'd prefer to 
keep get_sgtable() as small and consolidated as possible so that it's 
that much easier to delete in future :)


I guess there is a certain symmetry with mmap(), so if that's the angle 
you're dead set on, could we at least keep this guy down where 
__iommu_dma_get_sgtable_page() was?


Robin.


  static void iommu_dma_sync_single_for_cpu(struct device *dev,
dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
  {
@@ -1084,42 +1096,24 @@ static int iommu_dma_mmap(struct device *dev, struct 
vm_area_struct *vma,
vma_pages(vma) << PAGE_SHIFT, vma->vm_page_prot);
  }
  
-static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,

-   size_t size)
-{
-   int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
-
-   if (!ret)
-   sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
-   return ret;
-}
-
  static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs)
  {
-   unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-   struct vm_struct *area = find_vm_area(cpu_addr);
-
-   if (!is_vmalloc_addr(cpu_addr)) {
-   struct page *page = virt_to_page(cpu_addr);
-   return __iommu_dma_get_sgtable_page(sgt, page, size);
-   }
-
-   if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-   /*
-* DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-* hence in the vmalloc space.
-*/
-   struct page *page = vmalloc_to_page(cpu_addr);
-   return __iommu_dma_get_sgtable_page(sgt, page, size);
-   }
+   struct page *page;
+   int ret;
  
-	if (WARN_ON(!area || !area->pages))

-   return -ENXIO;
+   if (is_vmalloc_addr(cpu_addr)) {
+   if (!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
+   return iommu_dma_get_sgtable_remap(sgt, cpu_addr, size);
+   page = vmalloc_to_page(cpu_addr);
+   } else
+   page = virt_to_page(cpu_addr);
  
-	return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,

-GFP_KERNEL);
+   ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+   if (!ret)
+   sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+   return ret;
  }
  
  static const struct dma_map_ops iommu_dma_ops = {





Re: [PATCH 08/21] dma-iommu: refactor iommu_dma_mmap

2019-04-09 Thread Robin Murphy

On 27/03/2019 08:04, Christoph Hellwig wrote:

Move the vm_area handling into __iommu_dma_mmap, which is renamed
to iommu_dma_mmap_remap.

Inline __iommu_dma_mmap_pfn into the main function to simplify the code
flow a bit.

Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 50 ++-
  1 file changed, 18 insertions(+), 32 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d14fe9f8c692..43bd3c7e0f6b 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -597,23 +597,27 @@ static struct page **__iommu_dma_alloc(struct device 
*dev, size_t size,
  }
  
  /**

- * __iommu_dma_mmap - Map a buffer into provided user VMA
- * @pages: Array representing buffer from __iommu_dma_alloc()
+ * iommu_dma_mmap_remap - Map a remapped page array into provided user VMA
+ * @cpu_addr: virtual address of the memory to be remapped
   * @size: Size of buffer in bytes
   * @vma: VMA describing requested userspace mapping
   *
- * Maps the pages of the buffer in @pages into @vma. The caller is responsible
+ * Maps the pages pointed to by @cpu_addr into @vma. The caller is responsible
   * for verifying the correct size and protection of @vma beforehand.
   */
-static int __iommu_dma_mmap(struct page **pages, size_t size,
+static int iommu_dma_mmap_remap(void *cpu_addr, size_t size,
struct vm_area_struct *vma)
  {
+   struct vm_struct *area = find_vm_area(cpu_addr);
unsigned long uaddr = vma->vm_start;
unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
int ret = -ENXIO;
  
+	if (WARN_ON(!area || !area->pages))

+   return -ENXIO;
+
for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) {
-   ret = vm_insert_page(vma, uaddr, pages[i]);
+   ret = vm_insert_page(vma, uaddr, area->pages[i]);
if (ret)
break;
uaddr += PAGE_SIZE;
@@ -1052,21 +1056,13 @@ static void iommu_dma_free(struct device *dev, size_t 
size, void *cpu_addr,
}
  }
  
-static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,

- unsigned long pfn, size_t size)
-{
-   return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
-  vma->vm_end - vma->vm_start,
-  vma->vm_page_prot);
-}
-
  static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs)
  {
unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
unsigned long off = vma->vm_pgoff;
-   struct vm_struct *area;
+   unsigned long pfn;
int ret;
  
  	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);

@@ -1077,25 +1073,15 @@ static int iommu_dma_mmap(struct device *dev, struct 
vm_area_struct *vma,
if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
return -ENXIO;
  
-	if (!is_vmalloc_addr(cpu_addr)) {

-   unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
-   return __iommu_dma_mmap_pfn(vma, pfn, size);
-   }
+   if (is_vmalloc_addr(cpu_addr)) {
+   if (!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
+   return iommu_dma_mmap_remap(cpu_addr, size, vma);
+   pfn = vmalloc_to_pfn(cpu_addr);
+   } else
+   pfn = page_to_pfn(virt_to_page(cpu_addr));


Nit: braces around the else clause for correct style.

Otherwise, this seems to make sense;

Reviewed-by: Robin Murphy 

  
-	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {

-   /*
-* DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-* hence in the vmalloc space.
-*/
-   unsigned long pfn = vmalloc_to_pfn(cpu_addr);
-   return __iommu_dma_mmap_pfn(vma, pfn, size);
-   }
-
-   area = find_vm_area(cpu_addr);
-   if (WARN_ON(!area || !area->pages))
-   return -ENXIO;
-
-   return __iommu_dma_mmap(area->pages, size, vma);
+   return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
+   vma_pages(vma) << PAGE_SHIFT, vma->vm_page_prot);
  }
  
  static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,





Re: revert dma direct internals abuse

2019-04-09 Thread h...@lst.de
On Tue, Apr 09, 2019 at 02:17:40PM +, Thomas Hellstrom wrote:
> If that's the case, I think most of the graphics drivers will stop
> functioning. I don't think people would want that, and even if the
> graphics drivers are "to blame" due to not implementing the sync calls,
> I think the work involved to get things right is impressive if at all
> possible.

Note that this only affects external, untrusted devices.  But that
may include eGPU, so yes GPU folks finally need to up their game and
stop thinking they are above the law^H^H^Hinterface.

> There are two things that concern me with dma_alloc_coherent:
> 
> 1) It seems to want pages mapped either in the kernel map or vmapped.
> Graphics drivers allocate huge amounts of memory, typically up to 50%
> of system memory or more. In a 32 bit PAE system I'm afraid of running
> out of vmap space as well as not being able to allocate as much memory
> as I want. Perhaps a dma_alloc_coherent() interface that returns a page
> rather than a virtual address would do the trick.

We can't just simply export a page.  For devices that are not cache
coherent we need to remap the returned memory to be uncached.  In the
common cases that is either done by setting an uncached bit in the page
tables, or by using a special address space alias.  So the virtual
address to access the page matter, and we can't just kmap a random
page an expect it to be coherent.  If you want memory that is not
mapped into the kernel direct mapping and DMA to it you need to
use the proper DMA streaming interface and obey their rules.
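
To spell out "the rules" for the streaming case, a minimal sketch
(device, page and direction are placeholders, error handling trimmed):

	dma_addr_t addr = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);

	if (dma_mapping_error(dev, addr))
		return -ENOMEM;

	/* the device owns the buffer; before the CPU looks at it: */
	dma_sync_single_for_cpu(dev, addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
	/* ... CPU accesses the buffer ... */
	dma_sync_single_for_device(dev, addr, PAGE_SIZE, DMA_BIDIRECTIONAL);

	dma_unmap_page(dev, addr, PAGE_SIZE, DMA_BIDIRECTIONAL);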

> 2) Exporting using dma-buf. A page allocated using dma_alloc_coherent()
> for one device might not be coherent for another device. What happens
> if I allocate a page using dma_alloc_coherent() for device 1 and then
> want to map it using dma_map_page() for device 2?

The problem in this case isn't really the coherency - once a page
is mapped uncached it is 'coherent' for all devices, even those not
requiring it.  The problem is addressability - the DMA address for
the same memory might be different for different devices, and something
that looks contigous to one device that is using an IOMMU might not
for another one using the direct mapping.

We have the dma_get_sgtable API to map a piece of coherent memory
using the streaming APIs for another device, but it has all sorts of
problems.
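
For illustration, the cross-device pattern referred to above looks
roughly like this (placeholder names, no error handling, and none of the
"all sorts of problems" addressed):

	struct sg_table sgt;
	int nents;

	/* describe dev1's coherent buffer ... */
	dma_get_sgtable(dev1, &sgt, cpu_addr, dma_handle, size);

	/* ... and map it for dev2 through the streaming API */
	nents = dma_map_sg(dev2, sgt.sgl, sgt.orig_nents, DMA_BIDIRECTIONAL);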

That being said: your driver already uses the dma coherent API
under various circumstances, so you already have the above issues.

In the end swiotlb_nr_tbl() might be the best hint that some bounce
buffering could happen.  This isn't proper use of the API, but at
least a little better than your old intel_iommu_enabled check
and much better than what we have right now.  Something like this:

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 6165fe2c4504..ff00bea026c5 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -545,21 +545,6 @@ static void vmw_get_initial_size(struct vmw_private 
*dev_priv)
dev_priv->initial_height = height;
 }
 
-/**
- * vmw_assume_iommu - Figure out whether coherent dma-remapping might be
- * taking place.
- * @dev: Pointer to the struct drm_device.
- *
- * Return: true if iommu present, false otherwise.
- */
-static bool vmw_assume_iommu(struct drm_device *dev)
-{
-   const struct dma_map_ops *ops = get_dma_ops(dev->dev);
-
-   return !dma_is_direct(ops) && ops &&
-   ops->map_page != dma_direct_map_page;
-}
-
 /**
  * vmw_dma_select_mode - Determine how DMA mappings should be set up for this
  * system.
@@ -581,25 +566,14 @@ static int vmw_dma_select_mode(struct vmw_private 
*dev_priv)
[vmw_dma_map_populate] = "Keeping DMA mappings.",
[vmw_dma_map_bind] = "Giving up DMA mappings early."};
 
-   if (vmw_force_coherent)
-   dev_priv->map_mode = vmw_dma_alloc_coherent;
-   else if (vmw_assume_iommu(dev_priv->dev))
-   dev_priv->map_mode = vmw_dma_map_populate;
-   else if (!vmw_force_iommu)
-   dev_priv->map_mode = vmw_dma_phys;
-   else if (IS_ENABLED(CONFIG_SWIOTLB) && swiotlb_nr_tbl())
+   if (vmw_force_coherent ||
+   (IS_ENABLED(CONFIG_SWIOTLB) && swiotlb_nr_tbl()))
dev_priv->map_mode = vmw_dma_alloc_coherent;
+   else if (vmw_restrict_iommu)
+   dev_priv->map_mode = vmw_dma_map_bind;
else
dev_priv->map_mode = vmw_dma_map_populate;
 
-   if (dev_priv->map_mode == vmw_dma_map_populate && vmw_restrict_iommu)
-   dev_priv->map_mode = vmw_dma_map_bind;
-
-   /* No TTM coherent page pool? FIXME: Ask TTM instead! */
-if (!(IS_ENABLED(CONFIG_SWIOTLB) || IS_ENABLED(CONFIG_INTEL_IOMMU)) &&
-   (dev_priv->map_mode == vmw_dma_alloc_coherent))
-   return -EINVAL;
-
DRM_INFO("DMA map mode: %s\n", names[dev_priv->map_mode]);
return 0;
 }


Re: [PATCH 01/18] drivers core: Add I/O ASID allocator

2019-04-09 Thread Andriy Shevchenko
On Tue, Apr 09, 2019 at 07:53:08AM -0700, Paul E. McKenney wrote:
> On Tue, Apr 09, 2019 at 01:30:30PM +0300, Andriy Shevchenko wrote:
> > On Tue, Apr 09, 2019 at 03:04:36AM -0700, Christoph Hellwig wrote:
> > > On Tue, Apr 09, 2019 at 01:00:49PM +0300, Andriy Shevchenko wrote:
> > > > I think it makes sense to add a helper macro to rcupdate.h
> > > > (and we have several cases in kernel that can utilize it)
> > > > 
> > > > #define kfree_non_null_rcu(ptr, rcu_head)   \
> > > > do {\
> > > > if (ptr)\
> > > > kfree_rcu(ptr, rcu_head);   \
> > > > } while (0)
> > > > 
> > > > as a more common pattern for resource deallocators.
> > > 
> > > I think that should move straight into kfree_rcu.  
> > 
> > Possible. I didn't dare to offer this due to lack of knowledge how it's 
> > used in
> > other places.
> > 
> > > In general
> > > we expect *free* to deal with NULL pointers transparently, so we
> > > should do so here as well.
> > 
> > Exactly my point, thanks.
> 
> As shown below?

Looks pretty much good to me, thanks!
Reviewed-by: Andriy Shevchenko 

> 
> And now that you mention it, it is a bit surprising that no one has
> complained before.  ;-)
> 
>   Thanx, Paul
> 
> 
> 
> commit 23ad938244968e9d2a8001a1c52887c113b182f6
> Author: Paul E. McKenney 
> Date:   Tue Apr 9 07:48:18 2019 -0700
> 
> rcu: Make kfree_rcu() ignore NULL pointers
> 
> This commit makes the kfree_rcu() macro's semantics be consistent
> with the likes of kfree() by adding a check for NULL pointers, so
> that kfree_rcu(NULL, ...) is a no-op.
> 
> Reported-by: Andriy Shevchenko 
> Reported-by: Christoph Hellwig 
> Signed-off-by: Paul E. McKenney 
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 922bb6848813..c68649b9bcec 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -828,9 +828,13 @@ static inline notrace void 
> rcu_read_unlock_sched_notrace(void)
>   * The BUILD_BUG_ON check must not involve any function calls, hence the
>   * checks are done in macros here.
>   */
> -#define kfree_rcu(ptr, rcu_head) \
> - __kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
> -
> +#define kfree_rcu(ptr, rhf)  \
> +do { \
> + typeof (ptr) ___p = (ptr);  \
> + \
> + if (___p)   \
> + __kfree_rcu(&((___p)->rhf), offsetof(typeof(*(ptr)), rhf)); \
> +} while (0)
>  
>  /*
>   * Place this after a lock-acquisition primitive to guarantee that
> 

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 02/21] arm64/iommu: improve mmap bounds checking

2019-04-09 Thread Robin Murphy

On 07/04/2019 07:59, Christoph Hellwig wrote:

On Fri, Apr 05, 2019 at 06:30:52PM +0100, Robin Murphy wrote:

On 27/03/2019 08:04, Christoph Hellwig wrote:

The nr_pages checks should be done for all mmap requests, not just those
using remap_pfn_range.


Hmm, the logic in iommu_dma_mmap() inherently returns an error for the
"off >= nr_pages" case already. It's also supposed to be robust against the
"vma_pages(vma) > nr_pages - off" condition, although by making the partial
mapping and treating it as a success, rather than doing nothing and
returning an error. What's the exact motivation here?


Have one error check at the front of the function that is identical
to the mmap checks in the other dma_map_ops instances so that:

  a) we get the same error behavior for partial requests everywhere
  b) we can lift these checks into common code in the next round.



Fair enough, but in that case why isn't the dma_mmap_from_coherent() 
path also covered?


Robin.


Re: [PATCH 07/21] dma-iommu: move the arm64 wrappers to common code

2019-04-09 Thread Robin Murphy

On 27/03/2019 08:04, Christoph Hellwig wrote:
[...]

@@ -649,19 +696,44 @@ static dma_addr_t __iommu_dma_map(struct device *dev, 
phys_addr_t phys,
return iova + iova_off;
  }
  
-dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,

+static dma_addr_t __iommu_dma_map_page(struct device *dev, struct page *page,
unsigned long offset, size_t size, int prot)
  {
return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot,
iommu_get_dma_domain(dev));
  }
  
-void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size,

-   enum dma_data_direction dir, unsigned long attrs)
+static void __iommu_dma_unmap_page(struct device *dev, dma_addr_t handle,
+   size_t size, enum dma_data_direction dir, unsigned long attrs)
  {
__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
  }
  
+static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,

+   unsigned long offset, size_t size, enum dma_data_direction dir,
+   unsigned long attrs)
+{
+   phys_addr_t phys = page_to_phys(page) + offset;
+   bool coherent = dev_is_dma_coherent(dev);
+   dma_addr_t dma_handle;
+
+   dma_handle =__iommu_dma_map(dev, phys, size,
+   dma_info_to_prot(dir, coherent, attrs),
+   iommu_get_dma_domain(dev));
+   if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
+   dma_handle != DMA_MAPPING_ERROR)
+   arch_sync_dma_for_device(dev, phys, size, dir);
+   return dma_handle;
+}
+
+static void iommu_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
+   size_t size, enum dma_data_direction dir, unsigned long attrs)
+{
+   if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+   iommu_dma_sync_single_for_cpu(dev, dma_handle, size, dir);
+   __iommu_dma_unmap(iommu_get_domain_for_dev(dev), dma_handle, size);


That wants to be iommu_get_dma_domain() there to minimise the overhead. 
In fact, I guess this could now be streamlined a bit in the style of 
iommu_dma_map_page() above - i.e. avoid doing a second domain lookup in 
the sync case - but that can happen later (if indeed you haven't already).
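
For the record, a rough sketch of that streamlining, using only helpers
already visible in this patch (not an actual follow-up change):

static void iommu_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
		size_t size, enum dma_data_direction dir, unsigned long attrs)
{
	struct iommu_domain *domain = iommu_get_dma_domain(dev);

	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) && !dev_is_dma_coherent(dev)) {
		phys_addr_t phys = iommu_iova_to_phys(domain, dma_handle);

		/* the single domain lookup serves both the sync and the unmap */
		arch_sync_dma_for_cpu(dev, phys, size, dir);
	}
	__iommu_dma_unmap(domain, dma_handle, size);
}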



+}
+
  /*
   * Prepare a successfully-mapped scatterlist to give back to the caller.
   *


[...]


@@ -843,12 +923,257 @@ dma_addr_t iommu_dma_map_resource(struct device *dev, 
phys_addr_t phys,
iommu_get_dma_domain(dev));
  }
  
-void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,

+static void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
size_t size, enum dma_data_direction dir, unsigned long attrs)
  {
__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
  }
  
+static void *iommu_dma_alloc(struct device *dev, size_t size,

+   dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
+{
+   bool coherent = dev_is_dma_coherent(dev);
+   int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
+   size_t iosize = size;
+   void *addr;
+
+   size = PAGE_ALIGN(size);
+
+   /*
+* Some drivers rely on this, and we probably don't want the
+* possibility of stale kernel data being read by devices anyway.
+*/


That comment can probably be dropped now that zeroing is official API 
behaviour.



+   gfp |= __GFP_ZERO;


[...]


+/*
+ * The IOMMU core code allocates the default DMA domain, which the underlying
+ * IOMMU driver needs to support via the dma-iommu layer.
+ */
+void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
+   const struct iommu_ops *ops)


There's really no need to even pass @ops in here - in the existing arm64 
logic it's merely a token representing "should I do IOMMU setup?", and 
after this refactoring that's already been decided by our caller here.



+{
+   struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+
+   if (!domain || domain->type != IOMMU_DOMAIN_DMA)


This change means we now spam the logs with spurious warnings when 
configured for passthrough, which is not what we want.



+   goto out_err;
+   if (iommu_dma_init_domain(domain, dma_base, size, dev))
+   goto out_err;
+
+   dev->dma_ops = _dma_ops;
+   return;
+out_err:
+pr_warn("Failed to set up IOMMU for device %s; retaining platform DMA 
ops\n",
+dev_name(dev));
+}
+
  static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
phys_addr_t msi_addr, struct iommu_domain *domain)
  {
@@ -921,3 +1246,5 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msg->address_lo += lower_32_bits(msi_page->iova);
}
  }
+
+arch_initcall(iova_cache_get);


This feels a bit funky - TBH I'd rather just keep iommu_dma_init() 
around and make it static, if only for the sake of 

Re: [PATCH 18/18] iommu/vt-d: Add svm/sva invalidate function

2019-04-09 Thread Andriy Shevchenko
On Mon, Apr 08, 2019 at 04:59:33PM -0700, Jacob Pan wrote:
> When Shared Virtual Address (SVA) is enabled for a guest OS via
> vIOMMU, we need to provide invalidation support at IOMMU API and driver
> level. This patch adds Intel VT-d specific function to implement
> iommu passdown invalidate API for shared virtual address.
> 
> The use case is for supporting caching structure invalidation
> of assigned SVM capable devices. Emulated IOMMU exposes queue
> invalidation capability and passes down all descriptors from the guest
> to the physical IOMMU.
> 
> The assumption is that guest to host device ID mapping should be
> resolved prior to calling IOMMU driver. Based on the device handle,
> host IOMMU driver can replace certain fields before submit to the
> invalidation queue.

> +static int intel_iommu_sva_invalidate(struct iommu_domain *domain,
> + struct device *dev, struct iommu_cache_invalidate_info 
> *inv_info)
> +{
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> + struct device_domain_info *info;
> + struct intel_iommu *iommu;
> + unsigned long flags;
> + int cache_type;
> + u8 bus, devfn;
> + u16 did, sid;
> + int ret = 0;
> + u64 granu;
> + u64 size;
> +
> + if (!inv_info || !dmar_domain ||
> + inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
> + return -EINVAL;
> +
> + iommu = device_to_iommu(dev, , );
> + if (!iommu)
> + return -ENODEV;
> +

> + if (!dev || !dev_is_pci(dev))
> + return -ENODEV;

How is dev used in the call above? Can dev be NULL there (it looks optional)
while device_to_iommu() still returns a non-NULL iommu?

> + switch (1 << cache_type) {

BIT() ?

> + case IOMMU_CACHE_INV_TYPE_IOTLB:
> + if (size && (inv_info->addr_info.addr & ((1 << 
> (VTD_PAGE_SHIFT + size)) - 1))) {

BIT() ?

> + pr_err("Address out of range, 0x%llx, size 
> order %llu\n",
> + inv_info->addr_info.addr, size);
> + ret = -ERANGE;
> + goto out_unlock;
> + }
> +
> + qi_flush_piotlb(iommu, did, 
> mm_to_dma_pfn(inv_info->addr_info.addr),
> + inv_info->addr_info.pasid,
> + size, granu);
> +
> + /*
> +  * Always flush device IOTLB if ATS is enabled since 
> guest
> +  * vIOMMU exposes CM = 1, no device IOTLB flush will be 
> passed
> +  * down. REVISIT: cannot assume Linux guest
> +  */
> + if (info->ats_enabled) {
> + qi_flush_dev_piotlb(iommu, sid, info->pfsid,
> + inv_info->addr_info.pasid, 
> info->ats_qdep,
> + inv_info->addr_info.addr, size,
> + granu);
> + }
> + break;
> + case IOMMU_CACHE_INV_TYPE_DEV_IOTLB:
> + if (info->ats_enabled) {
> + qi_flush_dev_piotlb(iommu, sid, info->pfsid,
> + inv_info->addr_info.pasid, 
> info->ats_qdep,
> + inv_info->addr_info.addr, size,
> + granu);
> + } else
> + pr_warn("Passdown device IOTLB flush w/o 
> ATS!\n");
> +
> + break;
> + case IOMMU_CACHE_INV_TYPE_PASID:
> + qi_flush_pasid_cache(iommu, did, granu, 
> inv_info->pasid);
> +
> + break;
> + default:
> + dev_err(dev, "Unsupported IOMMU invalidation type %d\n",
> + cache_type);
> + ret = -EINVAL;
> + }

> +out_unlock:
> + spin_unlock(>lock);
> + spin_unlock_irqrestore(_domain_lock, flags);
> +
> + return ret;
> +}

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 15/18] iommu/vt-d: Add bind guest PASID support

2019-04-09 Thread Andriy Shevchenko
On Mon, Apr 08, 2019 at 04:59:30PM -0700, Jacob Pan wrote:
> When supporting guest SVA with emulated IOMMU, the guest PASID
> table is shadowed in VMM. Updates to guest vIOMMU PASID table
> will result in PASID cache flush which will be passed down to
> the host as bind guest PASID calls.
> 
> For the SL page tables, it will be harvested from device's
> default domain (request w/o PASID), or aux domain in case of
> mediated device.
> 
> .-.  .---.
> |   vIOMMU|  | Guest process CR3, FL only|
> | |  '---'
> ./
> | PASID Entry |--- PASID cache flush -
> '-'   |
> | |   V
> | |CR3 in GPA
> '-'
> Guest
> --| Shadow |--|
>   vv  v
> Host
> .-.  .--.
> |   pIOMMU|  | Bind FL for GVA-GPA  |
> | |  '--'
> ./  |
> | PASID Entry | V (Nested xlate)
> '\.--.
> | |   |SL for GPA-HPA, default domain|
> | |   '--'
> '-'
> Where:
>  - FL = First level/stage one page tables
>  - SL = Second level/stage two page tables

> + list_for_each_entry(sdev, >devs, list) {
> + if (dev == sdev->dev) {
...
> + }
> + }

This pattern occurs at least twice, perhaps something like

#define for_each_svm_dev()  \
list_for_each_entry()   \
if (dev != sdev->dev) {} else

?
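
A slightly more fleshed-out version of that idea, purely as a sketch
(the macro name and parameters are illustrative):

#define for_each_svm_dev(sdev, svm, d)				\
	list_for_each_entry((sdev), &(svm)->devs, list)		\
		if ((d) != (sdev)->dev) {} else

/* usage:
 *	for_each_svm_dev(sdev, svm, dev) {
 *		...
 *	}
 */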

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 16/18] iommu: add max num of cache and granu types

2019-04-09 Thread Andriy Shevchenko
On Mon, Apr 08, 2019 at 04:59:31PM -0700, Jacob Pan wrote:

Commit message?

> Signed-off-by: Jacob Pan 
> ---
>  include/uapi/linux/iommu.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> index 9344cbb..59569d6 100644
> --- a/include/uapi/linux/iommu.h
> +++ b/include/uapi/linux/iommu.h
> @@ -197,6 +197,7 @@ struct iommu_inv_addr_info {
>   __u64   granule_size;
>   __u64   nb_granules;
>  };
> +#define NR_IOMMU_CACHE_INVAL_GRANU   (3)
>  
>  /**
>   * First level/stage invalidation information
> @@ -228,6 +229,7 @@ struct iommu_cache_invalidate_info {
>   struct iommu_inv_addr_info addr_info;
>   };
>  };
> +#define NR_IOMMU_CACHE_TYPE  (3)

+ blank line?

>  /**
>   * struct gpasid_bind_data - Information about device and guest PASID binding
>   * @gcr3:Guest CR3 value from guest mm
> -- 
> 2.7.4
> 

-- 
With Best Regards,
Andy Shevchenko




Re: revert dma direct internals abuse

2019-04-09 Thread Thomas Hellstrom via iommu
On Tue, 2019-04-09 at 15:31 +0200, h...@lst.de wrote:
> On Tue, Apr 09, 2019 at 01:04:51PM +, Thomas Hellstrom wrote:
> > On the VMware platform we have two possible vIOMMUS, the AMD iommu
> > and
> > Intel VTD, Given those conditions I belive the patch is
> > functionally
> > correct. We can't cover the AMD case with intel_iommu_enabled.
> > Furthermore the only form of incoherency that can affect our
> > graphics
> > device is someone forcing SWIOTLB in which case that person would
> > be
> > happier with software rendering. In any case, observing the fact
> > that
> > the direct_ops are not used makes sure that SWIOTLB is not used.
> > Knowing that we're on the VMware platform, we're coherent and can
> > safely have the dma layer do dma address translation for us. All
> > this
> > information was not explicilty written in the changelog, no.
> 
> We have a series pending that might bounce your buffers even when
> using the Intel IOMMU, which should eventually also find its way
> to other IOMMUs:
> 
> 
> https://lists.linuxfoundation.org/pipermail/iommu/2019-March/034090.html

If that's the case, I think most of the graphics drivers will stop
functioning. I don't think people would want that, and even if the
graphics drivers are "to blame" due to not implementing the sync calls,
I think the work involved to get things right is impressive if at all
possible.

> 
> > In any case, assuming that that patch is reverted due to the
> > layering
> > violation, Are you willing to help out with a small API to detect
> > the
> > situation where streaming DMA is incoherent?
> 
> The short but sad answer is that we can't ever guarantee that you
> can skip the dma_*sync_* calls.  There are too many factors in play
> that might require it at any time - working around unaligned
> addresses
> in iommus, CPUs that are coherent for some device and not some,
> addressing
> limitations both in physical CPUs and VMs (see the various "secure
> VM"
> concepts floating around at the moment).
> 
> If you want to avoid the dma_*sync_* calls you must use
> dma_alloc_coherent to allocate the memory.  Note that the memory for
> dma_alloc_coherent actually comes from the normal page pool most of
> the time, and for certain on x86, which seems to be what you care
> about.  The times of it dipping into the tiny swiotlb pool are long
> gone.  So at least for you I see absolutely no reason to not simply
> always use dma_alloc_coherent to start with.  For other uses that
> involve platforms without DMA coherent devices like arm the tradeoffs
> might be a little different.

There are two things that concerns me with dma_alloc_coherent:

1) It seems to want pages mapped either in the kernel map or vmapped.
Graphics drivers allocate huge amounts of memory, Typically up to 50%
of system memory or more. In a 32 bit PAE system I'm afraid of running
out of vmap space as well as not being able to allocate as much memory
as I want. Perhaps a dma_alloc_coherent() interface that returns a page
rather than a virtual address would do the trick.

2) Exporting using dma-buf. A page allocated using dma_alloc_coherent()
for one device might not be coherent for another device. What happens
if I allocate a page using dma_alloc_coherent() for device 1 and then
want to map it using dma_map_page() for device 2?

Thanks,
Thomas





status of the calgary iommu driver

2019-04-09 Thread Christoph Hellwig
Hi Muli and Jon,

do you know if there are users of systems with the Calgary iommu
still around? It seems like the last non-drive-by changes to it
are from 2010, and I'm not sure how common these systems were.


Re: per-device dma_map_ops for intel-iommu?

2019-04-09 Thread David Woodhouse
On Tue, 2019-04-09 at 15:59 +0200, Christoph Hellwig wrote:
> Hi David and Joerg,
> 
> do you remember a good reason why intel-iommu is not using per-device
> dma_map_ops like the AMD iommu or the various ARM iommus?
> 
> Right now intel-iommu.c contains a half-asses reimplementation of the
> dma direct code for the iommu_no_mapping() case, and it would seem
> much nicer to just fall back to that case and not even call into
> intel-iommu in that case.

Other than the complexities about passthrough mode and various "oh shit
we forgot to actually test that iommu+gfx actually works before
shipping hardware" type of quirks that bypass the IOMMU for certain
devices — and retpolines, which I think you already dealt with — no, no
good reason that I recall.



per-device dma_map_ops for intel-iommu?

2019-04-09 Thread Christoph Hellwig
Hi David and Joerg,

do you remember a good reason why intel-iommu is not using per-device
dma_map_ops like the AMD iommu or the various ARM iommus?

Right now intel-iommu.c contains a half-assed reimplementation of the
dma direct code for the iommu_no_mapping() case, and it would seem
much nicer to just fall back to that case and not even call into
intel-iommu in that case.


Re: revert dma direct internals abuse

2019-04-09 Thread h...@lst.de
On Tue, Apr 09, 2019 at 01:04:51PM +, Thomas Hellstrom wrote:
> On the VMware platform we have two possible vIOMMUS, the AMD iommu and
> Intel VTD, Given those conditions I belive the patch is functionally
> correct. We can't cover the AMD case with intel_iommu_enabled.
> Furthermore the only form of incoherency that can affect our graphics
> device is someone forcing SWIOTLB in which case that person would be
> happier with software rendering. In any case, observing the fact that
> the direct_ops are not used makes sure that SWIOTLB is not used.
> Knowing that we're on the VMware platform, we're coherent and can
> safely have the dma layer do dma address translation for us. All this
> information was not explicilty written in the changelog, no.

We have a series pending that might bounce your buffers even when
using the Intel IOMMU, which should eventually also find its way
to other IOMMUs:

https://lists.linuxfoundation.org/pipermail/iommu/2019-March/034090.html

> In any case, assuming that that patch is reverted due to the layering
> violation, Are you willing to help out with a small API to detect the
> situation where streaming DMA is incoherent?

The short but sad answer is that we can't ever guarantee that you
can skip the dma_*sync_* calls.  There are too many factors in play
that might require it at any time - working around unaligned addresses
in iommus, CPUs that are coherent for some device and not some, addressing
limitations both in physical CPUs and VMs (see the various "secure VM"
concepts floating around at the moment).

If you want to avoid the dma_*sync_* calls you must use
dma_alloc_coherent to allocate the memory.  Note that the memory for
dma_alloc_coherent actually comes from the normal page pool most of
the time, and for certain on x86, which seems to be what you care
about.  The times of it dipping into the tiny swiotlb pool are long
gone.  So at least for you I see absolutely no reason to not simply
always use dma_alloc_coherent to start with.  For other uses that
involve platforms without DMA coherent devices like arm the tradeoffs
might be a little different.
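
To make the distinction concrete, a minimal sketch of the two styles being
contrasted here (generic DMA API only, no driver specifics; error handling
trimmed, and the helper name is invented):

	#include <linux/dma-mapping.h>

	static void dma_styles_sketch(struct device *dev, struct page *page,
				      size_t size)
	{
		dma_addr_t dma_handle, addr;
		void *cpu_addr;

		/* coherent allocation: no dma_*sync_* calls are ever needed */
		cpu_addr = dma_alloc_coherent(dev, size, &dma_handle, GFP_KERNEL);
		if (cpu_addr)
			dma_free_coherent(dev, size, cpu_addr, dma_handle);

		/* streaming mapping: ownership must be handed back and forth */
		addr = dma_map_page(dev, page, 0, size, DMA_BIDIRECTIONAL);
		if (dma_mapping_error(dev, addr))
			return;
		dma_sync_single_for_cpu(dev, addr, size, DMA_BIDIRECTIONAL);
		/* ... CPU touches the buffer here ... */
		dma_sync_single_for_device(dev, addr, size, DMA_BIDIRECTIONAL);
		dma_unmap_page(dev, addr, size, DMA_BIDIRECTIONAL);
	}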


Re: revert dma direct internals abuse

2019-04-09 Thread Thomas Hellstrom via iommu
On Tue, 2019-04-09 at 11:57 +0200, h...@lst.de wrote:
> On Mon, Apr 08, 2019 at 06:47:52PM +, Thomas Hellstrom wrote:
> > We HAVE discussed our needs, although admittedly some of my emails
> > ended up unanswered.
> 
> And than you haven't followed up, and instead ignored the layering
> instructions and just commited a broken patch?

Yes, I'm really sorry for not observing the layering instructions. To
be completely honest, I overlooked that layering warning rather than
ignoring it. The patch doesn't claim to be generic, and I believe that
on the VMware platform it is functionally correct; please see below. I
would also rather not focus on who did what and why. I can't see how
that would help this discussion move forward.

> 
> > We've as you're well aware of had a discussion with the other
> > subsystems doing user-space DMA-buffers wanting this functionality
> > from
> > the dma api (AMD graphics and RDMA people IIRC). that is a bool
> > that
> > tells us whether streaming dma mappings are coherent, and I
> > described
> > in detail why we couldn't use the dma_sync_* API and
> > dma_alloc_coherent().
> 
> And that is not at all what you either check or claim to do in the
> changelog (which btw, are two different things).
> 
> > The other option we have is to just fail miserably without messages
> > if
> > streaming DMA is not coherent, which I think the other drivers
> > might
> > do... That's all I'm trying to avoid here. I'd much prefer to have
> > the
> > dma API export this bool.
> 
> Both DMA direct and non-DMA direct streaming mappings can be either
> coherent or incoherent, so your patch doesn't archive that.  The
> commit log claims the following:
> 
> "drm/vmwgfx: Improve on IOMMU detection
> 
>  instead of relying on intel_iommu_enabled, use the fact that the
>  dma_map_ops::map_page != dma_direct_map_page"
> 
> which has nothing to do with the fact that streaming mappings are
> coherent.  It also is incorrect as there are direct mapping that
> do not use dma_direct_map_page (e.g. on ARM, or x86 with VMD).

On the VMware platform we have two possible vIOMMUs, the AMD iommu and
Intel VT-d. Given those conditions I believe the patch is functionally
correct. We can't cover the AMD case with intel_iommu_enabled.
Furthermore, the only form of incoherency that can affect our graphics
device is someone forcing SWIOTLB, in which case that person would be
happier with software rendering. In any case, observing the fact that
the direct_ops are not used makes sure that SWIOTLB is not used.
Knowing that we're on the VMware platform, we're coherent and can
safely have the dma layer do dma address translation for us. All this
information was not explicitly written in the changelog, no.


In any case, assuming that that patch is reverted due to the layering
violation, Are you willing to help out with a small API to detect the
situation where streaming DMA is incoherent?

/Thomas




[PATCH v5 6/6] x86/iommu: add support for generic boot option iommu.dma_mode

2019-04-09 Thread Zhen Lei
The following equivalence or replacement relationship exists:
iommu=pt <--> iommu.dma_mode=passthrough.
iommu=nopt can be replaced with iommu.dma_mode=lazy.
intel_iommu=strict <--> iommu.dma_mode=strict.
amd_iommu=fullflush <--> iommu.dma_mode=strict.

Signed-off-by: Zhen Lei 
---
 Documentation/admin-guide/kernel-parameters.txt |  6 ++---
 arch/ia64/include/asm/iommu.h   |  2 --
 arch/ia64/kernel/pci-dma.c  |  2 --
 arch/x86/include/asm/iommu.h|  1 -
 arch/x86/kernel/pci-dma.c   | 35 -
 drivers/iommu/Kconfig   |  2 +-
 drivers/iommu/amd_iommu.c   | 10 +++
 drivers/iommu/amd_iommu_init.c  |  4 +--
 drivers/iommu/amd_iommu_types.h |  6 -
 drivers/iommu/intel-iommu.c |  9 +++
 10 files changed, 31 insertions(+), 46 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 176f96032d9d62a..ca6edc529d6a614 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1815,9 +1815,9 @@
options(such as CONFIG_IOMMU_DEFAULT_PASSTHROUGH) to
choose which mode to be used.
Note: For historical reasons, ARM64/S390/PPC/X86 have
-   their specific options. Currently, only ARM64/S390/PPC
-   support this boot option, and hope other ARCHs to use
-   this as generic boot option.
+   their specific options, but strongly recommended switch
+   to use this one, the new ARCHs should use this generic
+   boot option.
passthrough
Configure DMA to bypass the IOMMU by default.
lazy
diff --git a/arch/ia64/include/asm/iommu.h b/arch/ia64/include/asm/iommu.h
index 7429a72f3f92199..92aceef63710861 100644
--- a/arch/ia64/include/asm/iommu.h
+++ b/arch/ia64/include/asm/iommu.h
@@ -8,10 +8,8 @@
 extern void no_iommu_init(void);
 #ifdef CONFIG_INTEL_IOMMU
 extern int force_iommu, no_iommu;
-extern int iommu_pass_through;
 extern int iommu_detected;
 #else
-#define iommu_pass_through (0)
 #define no_iommu   (1)
 #define iommu_detected (0)
 #endif
diff --git a/arch/ia64/kernel/pci-dma.c b/arch/ia64/kernel/pci-dma.c
index fe988c49f01ce6a..f5d49cd3fbb01a9 100644
--- a/arch/ia64/kernel/pci-dma.c
+++ b/arch/ia64/kernel/pci-dma.c
@@ -22,8 +22,6 @@
 int force_iommu __read_mostly;
 #endif
 
-int iommu_pass_through;
-
 static int __init pci_iommu_init(void)
 {
if (iommu_detected)
diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index baedab8ac5385f7..b91623d521d9f0f 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -4,7 +4,6 @@
 
 extern int force_iommu, no_iommu;
 extern int iommu_detected;
-extern int iommu_pass_through;
 
 /* 10 seconds */
 #define DMAR_OPERATION_TIMEOUT ((cycles_t) tsc_khz*10*1000)
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index d460998ae828514..fc64928e47cb860 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -34,21 +35,6 @@
 /* Set this to 1 if there is a HW IOMMU in the system */
 int iommu_detected __read_mostly = 0;
 
-/*
- * This variable becomes 1 if iommu=pt is passed on the kernel command line.
- * If this variable is 1, IOMMU implementations do no DMA translation for
- * devices and allow every device to access to whole physical memory. This is
- * useful if a user wants to use an IOMMU only for KVM device assignment to
- * guests and not for driver dma translation.
- * It is also possible to disable by default in kernel config, and enable with
- * iommu=nopt at boot time.
- */
-#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
-int iommu_pass_through __read_mostly = 1;
-#else
-int iommu_pass_through __read_mostly;
-#endif
-
 extern struct iommu_table_entry __iommu_table[], __iommu_table_end[];
 
 /* Dummy device used for NULL arguments (normally ISA). */
@@ -139,10 +125,23 @@ static __init int iommu_setup(char *p)
if (!strncmp(p, "soft", 4))
swiotlb = 1;
 #endif
+
+   /*
+* IOMMU implementations do no DMA translation for devices and
+* allow every device to access to whole physical memory. This
+* is useful if a user wants to use an IOMMU only for KVM
+* device assignment to guests and not for driver dma
+* translation.
+*/
if (!strncmp(p, "pt", 2))
-   iommu_pass_through = 1;
-   if (!strncmp(p, "nopt", 4))
-   

[PATCH v5 4/6] s390/pci: add support for generic boot option iommu.dma_mode

2019-04-09 Thread Zhen Lei
s390_iommu=strict is equivalent to iommu.dma_mode=strict.

Signed-off-by: Zhen Lei 
---
 Documentation/admin-guide/kernel-parameters.txt |  6 +++---
 arch/s390/pci/pci_dma.c | 14 +++---
 drivers/iommu/Kconfig   |  1 +
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 92d1b3151d003c2..ab8e3c4798c0a2a 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1815,9 +1815,9 @@
options(such as CONFIG_IOMMU_DEFAULT_PASSTHROUGH) to
choose which mode to be used.
Note: For historical reasons, ARM64/S390/PPC/X86 have
-   their specific options. Currently, only ARM64 support
-   this boot option, and hope other ARCHs to use this as
-   generic boot option.
+   their specific options. Currently, only ARM64/S390
+   support this boot option, and hope other ARCHs to use
+   this as generic boot option.
passthrough
Configure DMA to bypass the IOMMU by default.
lazy
diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index 9e52d1527f71495..f658ca41547eed5 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -17,7 +18,6 @@
 
 static struct kmem_cache *dma_region_table_cache;
 static struct kmem_cache *dma_page_table_cache;
-static int s390_iommu_strict;
 
 static int zpci_refresh_global(struct zpci_dev *zdev)
 {
@@ -193,13 +193,13 @@ static int __dma_purge_tlb(struct zpci_dev *zdev, 
dma_addr_t dma_addr,
if (!zdev->tlb_refresh)
return 0;
} else {
-   if (!s390_iommu_strict)
+   if (!IOMMU_DMA_MODE_IS_STRICT())
return 0;
}
 
ret = zpci_refresh_trans((u64) zdev->fh << 32, dma_addr,
 PAGE_ALIGN(size));
-   if (ret == -ENOMEM && !s390_iommu_strict) {
+   if (ret == -ENOMEM && !IOMMU_DMA_MODE_IS_STRICT()) {
/* enable the hypervisor to free some resources */
if (zpci_refresh_global(zdev))
goto out;
@@ -278,7 +278,7 @@ static dma_addr_t dma_alloc_address(struct device *dev, int 
size)
spin_lock_irqsave(&zdev->iommu_bitmap_lock, flags);
offset = __dma_alloc_iommu(dev, zdev->next_bit, size);
if (offset == -1) {
-   if (!s390_iommu_strict) {
+   if (!IOMMU_DMA_MODE_IS_STRICT()) {
/* global flush before DMA addresses are reused */
if (zpci_refresh_global(zdev))
goto out_error;
@@ -313,7 +313,7 @@ static void dma_free_address(struct device *dev, dma_addr_t 
dma_addr, int size)
if (!zdev->iommu_bitmap)
goto out;
 
-   if (s390_iommu_strict)
+   if (IOMMU_DMA_MODE_IS_STRICT())
bitmap_clear(zdev->iommu_bitmap, offset, size);
else
bitmap_set(zdev->lazy_bitmap, offset, size);
@@ -584,7 +584,7 @@ int zpci_dma_init_device(struct zpci_dev *zdev)
rc = -ENOMEM;
goto free_dma_table;
}
-   if (!s390_iommu_strict) {
+   if (!IOMMU_DMA_MODE_IS_STRICT()) {
zdev->lazy_bitmap = vzalloc(zdev->iommu_pages / 8);
if (!zdev->lazy_bitmap) {
rc = -ENOMEM;
@@ -675,7 +675,7 @@ void zpci_dma_exit(void)
 static int __init s390_iommu_setup(char *str)
 {
if (!strncmp(str, "strict", 6))
-   s390_iommu_strict = 1;
+   iommu_default_dma_mode_set(IOMMU_DMA_MODE_STRICT);
return 0;
 }
 
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 1986f9767da488b..b7173b106cd816a 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -77,6 +77,7 @@ config IOMMU_DEBUGFS
 choice
prompt "IOMMU dma mode"
depends on IOMMU_API
+   default IOMMU_DEFAULT_LAZY if S390_IOMMU
default IOMMU_DEFAULT_STRICT
help
  This option allows IOMMU dma mode to be chose at build time, to
-- 
1.8.3




[PATCH v5 5/6] powernv/iommu: add support for generic boot option iommu.dma_mode

2019-04-09 Thread Zhen Lei
iommu=nobypass can be replaced with iommu.dma_mode=strict.

Signed-off-by: Zhen Lei 
---
 Documentation/admin-guide/kernel-parameters.txt | 2 +-
 arch/powerpc/platforms/powernv/pci-ioda.c   | 5 ++---
 drivers/iommu/Kconfig   | 1 +
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index ab8e3c4798c0a2a..176f96032d9d62a 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1815,7 +1815,7 @@
options(such as CONFIG_IOMMU_DEFAULT_PASSTHROUGH) to
choose which mode to be used.
Note: For historical reasons, ARM64/S390/PPC/X86 have
-   their specific options. Currently, only ARM64/S390
+   their specific options. Currently, only ARM64/S390/PPC
support this boot option, and hope other ARCHs to use
this as generic boot option.
passthrough
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3ead4c237ed0ec9..8862885d866418f 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -85,7 +85,6 @@ void pe_level_printk(const struct pnv_ioda_pe *pe, const char 
*level,
va_end(args);
 }
 
-static bool pnv_iommu_bypass_disabled __read_mostly;
 static bool pci_reset_phbs __read_mostly;
 
 static int __init iommu_setup(char *str)
@@ -95,7 +94,7 @@ static int __init iommu_setup(char *str)
 
while (*str) {
if (!strncmp(str, "nobypass", 8)) {
-   pnv_iommu_bypass_disabled = true;
+   iommu_default_dma_mode_set(IOMMU_DMA_MODE_STRICT);
pr_info("PowerNV: IOMMU bypass window disabled.\n");
break;
}
@@ -2456,7 +2455,7 @@ static long pnv_pci_ioda2_setup_default_config(struct 
pnv_ioda_pe *pe)
return rc;
}
 
-   if (!pnv_iommu_bypass_disabled)
+   if (IOMMU_DMA_MODE_IS_PASSTHROUGH())
pnv_pci_ioda2_set_bypass(pe, true);
 
return 0;
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index b7173b106cd816a..5dca666b22e6cd5 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -77,6 +77,7 @@ config IOMMU_DEBUGFS
 choice
prompt "IOMMU dma mode"
depends on IOMMU_API
+   default IOMMU_DEFAULT_PASSTHROUGH if (PPC_POWERNV && PCI)
default IOMMU_DEFAULT_LAZY if S390_IOMMU
default IOMMU_DEFAULT_STRICT
help
-- 
1.8.3




[PATCH v5 1/6] iommu: add generic boot option iommu.dma_mode

2019-04-09 Thread Zhen Lei
Currently IOMMU DMA has 3 modes: passthrough, lazy and strict. The
passthrough mode bypasses the IOMMU, the lazy mode defers the
invalidation of hardware TLBs, and the strict mode invalidates IOMMU
hardware TLBs synchronously. The three modes are mutually exclusive,
but the current boot options are confusing: iommu.passthrough and
iommu.strict do not coexist well. So add iommu.dma_mode.

Signed-off-by: Zhen Lei 
---
 Documentation/admin-guide/kernel-parameters.txt | 19 
 drivers/iommu/iommu.c   | 59 -
 include/linux/iommu.h   |  5 +++
 3 files changed, 71 insertions(+), 12 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 2b8ee90bb64470d..f7766f8ac8b9084 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1811,6 +1811,25 @@
1 - Bypass the IOMMU for DMA.
unset - Use value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
 
+   iommu.dma_mode= Configure default dma mode. if unset, use the value
+   of CONFIG_IOMMU_DEFAULT_PASSTHROUGH to determine
+   passthrough or not.
+   Note: For historical reasons, ARM64/S390/PPC/X86 have
+   their specific options. Currently, only ARM64 support
+   this boot option, and hope other ARCHs to use this as
+   generic boot option.
+   passthrough
+   Configure DMA to bypass the IOMMU by default.
+   lazy
+   Request that DMA unmap operations use deferred
+   invalidation of hardware TLBs, for increased
+   throughput at the cost of reduced device isolation.
+   Will fall back to strict mode if not supported by
+   the relevant IOMMU driver.
+   strict
+   DMA unmap operations invalidate IOMMU hardware TLBs
+   synchronously.
+
io7=[HW] IO7 for Marvel based alpha systems
See comment before marvel_specify_io7 in
arch/alpha/kernel/core_marvel.c.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 109de67d5d727c2..df1ce8e22385b48 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -38,12 +38,13 @@
 
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
+
 #ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
-static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_IDENTITY;
+#define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_PASSTHROUGH
 #else
-static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_DMA;
+#define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_STRICT
 #endif
-static bool iommu_dma_strict __read_mostly = true;
+static int iommu_default_dma_mode __read_mostly = IOMMU_DEFAULT_DMA_MODE;
 
 struct iommu_callback_data {
const struct iommu_ops *ops;
@@ -147,20 +148,51 @@ static int __init iommu_set_def_domain_type(char *str)
int ret;
 
ret = kstrtobool(str, &pt);
-   if (ret)
-   return ret;
+   if (!ret && pt)
+   iommu_default_dma_mode = IOMMU_DMA_MODE_PASSTHROUGH;
 
-   iommu_def_domain_type = pt ? IOMMU_DOMAIN_IDENTITY : IOMMU_DOMAIN_DMA;
-   return 0;
+   return ret;
 }
 early_param("iommu.passthrough", iommu_set_def_domain_type);
 
 static int __init iommu_dma_setup(char *str)
 {
-   return kstrtobool(str, &iommu_dma_strict);
+   bool strict;
+   int ret;
+
+   ret = kstrtobool(str, &strict);
+   if (!ret)
+   iommu_default_dma_mode = strict ?
+   IOMMU_DMA_MODE_STRICT : IOMMU_DMA_MODE_LAZY;
+
+   return ret;
 }
 early_param("iommu.strict", iommu_dma_setup);
 
+static int __init iommu_dma_mode_setup(char *str)
+{
+   if (!str)
+   goto fail;
+
+   if (!strncmp(str, "passthrough", 11))
+   iommu_default_dma_mode = IOMMU_DMA_MODE_PASSTHROUGH;
+   else if (!strncmp(str, "lazy", 4))
+   iommu_default_dma_mode = IOMMU_DMA_MODE_LAZY;
+   else if (!strncmp(str, "strict", 6))
+   iommu_default_dma_mode = IOMMU_DMA_MODE_STRICT;
+   else
+   goto fail;
+
+   pr_info("Force dma mode to be %d\n", iommu_default_dma_mode);
+
+   return 0;
+
+fail:
+   pr_debug("Boot option iommu.dma_mode is incorrect, ignored\n");
+   return -EINVAL;
+}
+early_param("iommu.dma_mode", iommu_dma_mode_setup);
+
 static ssize_t iommu_group_attr_show(struct kobject *kobj,
 struct attribute *__attr, char *buf)
 {
@@ -1102,14 +1134,17 @@ struct iommu_group *iommu_group_get_for_dev(struct 
device *dev)
 */
if (!group->default_domain) 

[PATCH v5 3/6] iommu: add iommu_default_dma_mode_get/set() helper

2019-04-09 Thread Zhen Lei
Also add IOMMU_DMA_MODE_IS_{STRICT|LAZY|PASSTHROUGH}() to make the code
look cleaner.

There is no functional change, just prepare for the following patches.

Signed-off-by: Zhen Lei 
---
 drivers/iommu/iommu.c | 18 ++
 include/linux/iommu.h | 18 ++
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f4171bf4b46eaeb..86239dd46003fd4 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -195,6 +195,17 @@ static int __init iommu_dma_mode_setup(char *str)
 }
 early_param("iommu.dma_mode", iommu_dma_mode_setup);
 
+int iommu_default_dma_mode_get(void)
+{
+   return iommu_default_dma_mode;
+}
+
+void iommu_default_dma_mode_set(int mode)
+{
+   WARN_ON(mode > IOMMU_DMA_MODE_PASSTHROUGH);
+   iommu_default_dma_mode = mode;
+}
+
 static ssize_t iommu_group_attr_show(struct kobject *kobj,
 struct attribute *__attr, char *buf)
 {
@@ -1136,9 +1147,8 @@ struct iommu_group *iommu_group_get_for_dev(struct device 
*dev)
 */
if (!group->default_domain) {
struct iommu_domain *dom;
-   int def_domain_type =
-   (iommu_default_dma_mode == IOMMU_DMA_MODE_PASSTHROUGH)
-   ? IOMMU_DOMAIN_IDENTITY : IOMMU_DOMAIN_DMA;
+   int def_domain_type = IOMMU_DMA_MODE_IS_PASSTHROUGH() ?
+   IOMMU_DOMAIN_IDENTITY : IOMMU_DOMAIN_DMA;
 
dom = __iommu_domain_alloc(dev->bus, def_domain_type);
if (!dom && def_domain_type != IOMMU_DOMAIN_DMA) {
@@ -1154,7 +1164,7 @@ struct iommu_group *iommu_group_get_for_dev(struct device 
*dev)
if (!group->domain)
group->domain = dom;
 
-   if (dom && (iommu_default_dma_mode == IOMMU_DMA_MODE_LAZY)) {
+   if (dom && IOMMU_DMA_MODE_IS_LAZY()) {
int attr = 1;
iommu_domain_set_attr(dom,
  DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c3f4e3416176496..3668a8b3846996a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -46,6 +46,12 @@
 #define IOMMU_DMA_MODE_STRICT  0x0
 #define IOMMU_DMA_MODE_LAZY0x1
 #define IOMMU_DMA_MODE_PASSTHROUGH 0x2
+#define IOMMU_DMA_MODE_IS_STRICT() \
+   (iommu_default_dma_mode_get() == IOMMU_DMA_MODE_STRICT)
+#define IOMMU_DMA_MODE_IS_LAZY() \
+   (iommu_default_dma_mode_get() == IOMMU_DMA_MODE_LAZY)
+#define IOMMU_DMA_MODE_IS_PASSTHROUGH() \
+   (iommu_default_dma_mode_get() == IOMMU_DMA_MODE_PASSTHROUGH)
 
 struct iommu_ops;
 struct iommu_group;
@@ -421,6 +427,9 @@ static inline void dev_iommu_fwspec_set(struct device *dev,
 int iommu_probe_device(struct device *dev);
 void iommu_release_device(struct device *dev);
 
+extern int iommu_default_dma_mode_get(void);
+extern void iommu_default_dma_mode_set(int mode);
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -705,6 +714,15 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct 
fwnode_handle *fwnode)
return NULL;
 }
 
+static inline int iommu_default_dma_mode_get(void)
+{
+   return IOMMU_DMA_MODE_PASSTHROUGH;
+}
+
+static inline void iommu_default_dma_mode_set(int mode)
+{
+}
+
 #endif /* CONFIG_IOMMU_API */
 
 #ifdef CONFIG_IOMMU_DEBUGFS
-- 
1.8.3




[PATCH v5 2/6] iommu: add build options corresponding to iommu.dma_mode

2019-04-09 Thread Zhen Lei
First, add build options IOMMU_DEFAULT_{LAZY|STRICT}, so that we have the
opportunity to set {lazy|strict} mode as the default at build time. Then put
the three config options in a choice, so that only one of the three can be
selected at a time, matching the boot option iommu.dma_mode.

Signed-off-by: Zhen Lei 
---
 Documentation/admin-guide/kernel-parameters.txt |  6 ++--
 drivers/iommu/Kconfig   | 43 +
 drivers/iommu/iommu.c   |  4 ++-
 3 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index f7766f8ac8b9084..92d1b3151d003c2 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1811,9 +1811,9 @@
1 - Bypass the IOMMU for DMA.
unset - Use value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
 
-   iommu.dma_mode= Configure default dma mode. if unset, use the value
-   of CONFIG_IOMMU_DEFAULT_PASSTHROUGH to determine
-   passthrough or not.
+   iommu.dma_mode= Configure default dma mode. if unset, use the build
+   options(such as CONFIG_IOMMU_DEFAULT_PASSTHROUGH) to
+   choose which mode to be used.
Note: For historical reasons, ARM64/S390/PPC/X86 have
their specific options. Currently, only ARM64 support
this boot option, and hope other ARCHs to use this as
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6f07f3b21816c64..1986f9767da488b 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -74,17 +74,46 @@ config IOMMU_DEBUGFS
  debug/iommu directory, and then populate a subdirectory with
  entries as required.
 
-config IOMMU_DEFAULT_PASSTHROUGH
-   bool "IOMMU passthrough by default"
+choice
+   prompt "IOMMU dma mode"
depends on IOMMU_API
-help
- Enable passthrough by default, removing the need to pass in
- iommu.passthrough=on or iommu=pt through command line. If this
- is enabled, you can still disable with iommu.passthrough=off
- or iommu=nopt depending on the architecture.
+   default IOMMU_DEFAULT_STRICT
+   help
+ This option allows IOMMU dma mode to be chose at build time, to
+ override the default dma mode of each ARCHs, removing the need to
+ pass in kernel parameters through command line. You can still use the
+ generic boot option iommu.dma_mode or ARCHs specific boot options to
+ override this option again.
+
+config IOMMU_DEFAULT_PASSTHROUGH
+   bool "passthrough"
+   help
+ In this mode, the dma access through IOMMU without any addresses
+ transformation. That means, the wrong or illegal dma access can not
+ be caught, no error information will be reported.
 
  If unsure, say N here.
 
+config IOMMU_DEFAULT_LAZY
+   bool "lazy"
+   help
+ Support lazy mode, where for every IOMMU DMA unmap operation, the
+ flush operation of IOTLB and the free operation of IOVA are deferred.
+ They are only guaranteed to be done before the related IOVA will be
+ reused.
+
+config IOMMU_DEFAULT_STRICT
+   bool "strict"
+   help
+ For every IOMMU DMA unmap operation, the flush operation of IOTLB and
+ the free operation of IOVA are guaranteed to be done in the unmap
+ function.
+
+ This mode is safer than the two above, but it maybe slow in some high
+ performace scenarios.
+
+endchoice
+
 config OF_IOMMU
def_bool y
depends on OF && IOMMU_API
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index df1ce8e22385b48..f4171bf4b46eaeb 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -39,8 +39,10 @@
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
 
-#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
+#if defined(CONFIG_IOMMU_DEFAULT_PASSTHROUGH)
 #define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_PASSTHROUGH
+#elif defined(CONFIG_IOMMU_DEFAULT_LAZY)
+#define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_LAZY
 #else
 #define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_STRICT
 #endif
-- 
1.8.3




[PATCH v5 0/6] add generic boot option for IOMMU dma mode

2019-04-09 Thread Zhen Lei
v4 --> v5:
Per Hanjun's and Thomas Gleixner's suggestions:
1. Keep the old ARCH-specific boot options unchanged.
2. Keep the build option CONFIG_IOMMU_DEFAULT_PASSTHROUGH unchanged.

v4:
Per Robin Murphy's suggestion:
"It's also not necessarily obvious to the user how this interacts with
IOMMU_DEFAULT_PASSTHROUGH, so if we really do go down this route, maybe it
would be better to refactor the whole lot into a single selection of something
like IOMMU_DEFAULT_MODE anyway."

In this version, I tried to normalize the IOMMU dma mode boot options for all
ARCHs. When the IOMMU is enabled, there are 3 dma modes: passthrough (bypass),
lazy (map, but defer the IOTLB invalidation) and strict. But currently each
ARCH defines its own private boot options, different from the others. For
example, to enable/disable "passthrough", ARM64 uses iommu.passthrough=1/0,
X86 uses iommu=pt/nopt, and PPC/POWERNV uses iommu=nobypass.
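
For reference, a sketch of what the unified boot-time usage looks like after
this series (the arch-specific spellings above keep working unchanged):

	iommu.dma_mode=passthrough    # DMA bypasses the IOMMU
	iommu.dma_mode=lazy           # map, but defer IOTLB invalidation
	iommu.dma_mode=strict         # invalidate IOTLBs synchronously on unmap

The build-time default is selected by the CONFIG_IOMMU_DEFAULT_{PASSTHROUGH|
LAZY|STRICT} choice added in patch 2.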

Zhen Lei (6):
  iommu: add generic boot option iommu.dma_mode
  iommu: add build options corresponding to iommu.dma_mode
  iommu: add iommu_default_dma_mode_get/set() helper
  s390/pci: add support for generic boot option iommu.dma_mode
  powernv/iommu: add support for generic boot option iommu.dma_mode
  x86/iommu: add support for generic boot option iommu.dma_mode

 Documentation/admin-guide/kernel-parameters.txt | 19 +++
 arch/ia64/include/asm/iommu.h   |  2 -
 arch/ia64/kernel/pci-dma.c  |  2 -
 arch/powerpc/platforms/powernv/pci-ioda.c   |  5 +-
 arch/s390/pci/pci_dma.c | 14 ++---
 arch/x86/include/asm/iommu.h|  1 -
 arch/x86/kernel/pci-dma.c   | 35 ++--
 drivers/iommu/Kconfig   | 45 ---
 drivers/iommu/amd_iommu.c   | 10 ++--
 drivers/iommu/amd_iommu_init.c  |  4 +-
 drivers/iommu/amd_iommu_types.h |  6 --
 drivers/iommu/intel-iommu.c |  9 ++-
 drivers/iommu/iommu.c   | 73 -
 include/linux/iommu.h   | 23 
 14 files changed, 176 insertions(+), 72 deletions(-)

-- 
1.8.3




Re: [PATCH 01/18] drivers core: Add I/O ASID allocator

2019-04-09 Thread Andriy Shevchenko
On Tue, Apr 09, 2019 at 03:04:36AM -0700, Christoph Hellwig wrote:
> On Tue, Apr 09, 2019 at 01:00:49PM +0300, Andriy Shevchenko wrote:
> > I think it makes sense to add a helper macro to rcupdate.h
> > (and we have several cases in kernel that can utilize it)
> > 
> > #define kfree_non_null_rcu(ptr, rcu_head)   \
> > do {\
> > if (ptr)\
> > kfree_rcu(ptr, rcu_head);   \
> > } while (0)
> > 
> > as a more common pattern for resource deallocators.
> 
> I think that should move straight into kfree_rcu.  

Possible. I didn't dare to offer this due to lack of knowledge of how it's
used in other places.

> In general
> we expect *free* to deal with NULL pointers transparently, so we
> should do so here as well.

Exactly my point, thanks.

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 09/18] iommu/vt-d: Enlightened PASID allocation

2019-04-09 Thread Andriy Shevchenko
On Mon, Apr 08, 2019 at 04:59:24PM -0700, Jacob Pan wrote:
> From: Lu Baolu 
> 
> If Intel IOMMU runs in caching mode, a.k.a. virtual IOMMU, the
> IOMMU driver should rely on the emulation software to allocate
> and free PASID IDs. The Intel vt-d spec revision 3.0 defines a
> register set to support this. This includes a capability register,
> a virtual command register and a virtual response register. Refer
> to section 10.4.42, 10.4.43, 10.4.44 for more information.
> 
> This patch adds the enlightened PASID allocation/free interfaces
> via the virtual command register.

> + pr_debug("vcmd alloc pasid\n");

Perhaps tracepoints should be used here?
OTOH we have the function tracer, so this message is noise in any case.
The same applies to the other similar cases.
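
For illustration only, a tracepoint along these lines could replace the
pr_debug() (the event name and field are invented, and the usual trace-header
boilerplate such as TRACE_SYSTEM and CREATE_TRACE_POINTS is omitted):

	TRACE_EVENT(vcmd_alloc_pasid,
		TP_PROTO(u32 pasid),
		TP_ARGS(pasid),
		TP_STRUCT__entry(
			__field(u32, pasid)
		),
		TP_fast_assign(
			__entry->pasid = pasid;
		),
		TP_printk("pasid=%u", __entry->pasid)
	);

	/* at the call site, instead of pr_debug("vcmd alloc pasid\n"): */
	trace_vcmd_alloc_pasid(pasid);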

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 08/18] iommu: Introduce cache_invalidate API

2019-04-09 Thread Andriy Shevchenko
On Mon, Apr 08, 2019 at 04:59:23PM -0700, Jacob Pan wrote:
> From: "Liu, Yi L" 
> 
> In any virtualization use case, when the first translation stage
> is "owned" by the guest OS, the host IOMMU driver has no knowledge
> of caching structure updates unless the guest invalidation activities
> are trapped by the virtualizer and passed down to the host.
> 
> Since the invalidation data are obtained from user space and will be
> written into physical IOMMU, we must allow security check at various
> layers. Therefore, generic invalidation data format are proposed here,
> model specific IOMMU drivers need to convert them into their own format.

> +int iommu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
> +struct iommu_cache_invalidate_info *inv_info)
> +{
> + int ret = 0;

Redundant assignment.

> +
> + if (unlikely(!domain->ops->cache_invalidate))
> + return -ENODEV;
> +
> + ret = domain->ops->cache_invalidate(domain, dev, inv_info);
> +
> + return ret;
> +}
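
With the redundant initialization dropped, the helper reduces to something
like this (sketch of the suggested cleanup, derived from the quoted code):

	int iommu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
				   struct iommu_cache_invalidate_info *inv_info)
	{
		if (unlikely(!domain->ops->cache_invalidate))
			return -ENODEV;

		return domain->ops->cache_invalidate(domain, dev, inv_info);
	}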

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 01/18] drivers core: Add I/O ASID allocator

2019-04-09 Thread Christoph Hellwig
On Tue, Apr 09, 2019 at 01:00:49PM +0300, Andriy Shevchenko wrote:
> I think it makes sense to add a helper macro to rcupdate.h
> (and we have several cases in kernel that can utilize it)
> 
> #define kfree_non_null_rcu(ptr, rcu_head) \
>   do {\
>   if (ptr)\
>   kfree_rcu(ptr, rcu_head);   \
>   } while (0)
> 
> as a more common pattern for resource deallocators.

I think that should move straight into kfree_rcu.  In general
we expect *free* to deal with NULL pointers transparently, so we
should do so here as well.
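
One way to fold the check in, sketched against the existing kfree_rcu()
wrapper shape (illustration only, not necessarily the actual change that was
merged; __kfree_rcu() and the offsetof() trick are assumed from the current
macro):

	#define kfree_rcu(ptr, rhf)						\
	do {									\
		typeof(ptr) ___p = (ptr);					\
										\
		if (___p)							\
			__kfree_rcu(&((___p)->rhf),				\
				    offsetof(typeof(*(___p)), rhf));		\
	} while (0)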


Re: [PATCH 05/18] iommu: introduce device fault data

2019-04-09 Thread Andriy Shevchenko
On Mon, Apr 08, 2019 at 04:59:20PM -0700, Jacob Pan wrote:
> Device faults detected by IOMMU can be reported outside the IOMMU
> subsystem for further processing. This patch introduces
> a generic device fault data structure.
> 
> The fault can be either an unrecoverable fault or a page request,
> also referred to as a recoverable fault.
> 
> We only care about non internal faults that are likely to be reported
> to an external subsystem.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Jean-Philippe Brucker 
> Signed-off-by: Liu, Yi L 
> Signed-off-by: Ashok Raj 
> Signed-off-by: Eric Auger 

JFYI: we have a Co-developed-by tag as well for better granularity of what
certain person/people did.

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 01/18] drivers core: Add I/O ASID allocator

2019-04-09 Thread Andriy Shevchenko
On Mon, Apr 08, 2019 at 04:59:16PM -0700, Jacob Pan wrote:
> From: Jean-Philippe Brucker 
> 
> Some devices might support multiple DMA address spaces, in particular
> those that have the PCI PASID feature. PASID (Process Address Space ID)
> allows sharing process address spaces with devices (SVA), partitioning a
> device into VM-assignable entities (VFIO mdev) or simply providing
> multiple DMA address spaces to kernel drivers. Add a global PASID
> allocator usable by different drivers at the same time. Name it I/O ASID
> to avoid confusion with ASIDs allocated by arch code, which are usually
> a separate ID space.
> 
> The IOASID space is global. Each device can have its own PASID space,
> but by convention the IOMMU ended up having a global PASID space, so
> that with SVA, each mm_struct is associated to a single PASID.
> 
> The allocator doesn't really belong in drivers/iommu because some
> drivers would like to allocate PASIDs for devices that aren't managed by
> an IOMMU, using the same ID space as IOMMU. It doesn't really belong in
> drivers/pci either since platform devices also support PASID. Add the
> allocator in drivers/base.
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  drivers/base/Kconfig   |   7 
>  drivers/base/Makefile  |   1 +
>  drivers/base/ioasid.c  | 106 
> +
>  include/linux/ioasid.h |  40 +++
>  4 files changed, 154 insertions(+)
>  create mode 100644 drivers/base/ioasid.c
>  create mode 100644 include/linux/ioasid.h
> 
> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index 059700e..e05288d 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -182,6 +182,13 @@ config DMA_SHARED_BUFFER
> APIs extension; the file's descriptor can then be passed on to other
> driver.
>  
> +config IOASID
> + bool

> + default n

Redundant.

> + help
> +   Enable the I/O Address Space ID allocator. A single ID space shared
> +   between different users.

> +/**
> + * ioasid_free - Free an IOASID
> + * @ioasid: the ID to remove
> + */
> +void ioasid_free(ioasid_t ioasid)
> +{
> + struct ioasid_data *ioasid_data;
> +
> + idr_lock(&ioasid_idr);
> + ioasid_data = idr_remove(&ioasid_idr, ioasid);
> + idr_unlock(&ioasid_idr);
> +

> + if (ioasid_data)
> + kfree_rcu(ioasid_data, rcu);

I think it makes sense to add a helper macro to rcupdate.h
(and we have several cases in kernel that can utilize it)

#define kfree_non_null_rcu(ptr, rcu_head)   \
do {\
if (ptr)\
kfree_rcu(ptr, rcu_head);   \
} while (0)

as a more common pattern for resource deallocators.

> +}

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH] iommu: Fix offsetof() usage

2019-04-09 Thread Christoph Hellwig
> index 74e944bd4a8d..81d449451494 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1484,8 +1484,7 @@ static int arm_smmu_add_device(struct device *dev)
>   }
>  
>   ret = -ENOMEM;
> - cfg = kzalloc(offsetof(struct arm_smmu_master_cfg, smendx[i]),
> -   GFP_KERNEL);
> + cfg = kzalloc(struct_size(cfg, smendx, i), GFP_KERNEL);

This looks like a huge improvement, but I find the usage of i
here still very obfuscating.  Can we please use fwspec->num_ids
directly instead, to make it much more obvious?
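
Presumably something like the following, assuming fwspec (the device's
iommu_fwspec) is in scope at that point, which the suggestion implies:

	cfg = kzalloc(struct_size(cfg, smendx, fwspec->num_ids), GFP_KERNEL);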


Re: revert dma direct internals abuse

2019-04-09 Thread h...@lst.de
On Mon, Apr 08, 2019 at 06:47:52PM +, Thomas Hellstrom wrote:
> We HAVE discussed our needs, although admittedly some of my emails
> ended up unanswered.

And then you haven't followed up, and instead ignored the layering
instructions and just commited a broken patch?

> We've as you're well aware of had a discussion with the other
> subsystems doing user-space DMA-buffers wanting this functionality from
> the dma api (AMD graphics and RDMA people IIRC). that is a bool that
> tells us whether streaming dma mappings are coherent, and I described
> in detail why we couldn't use the dma_sync_* API and
> dma_alloc_coherent().

And that is not at all what you either check or claim to do in the
changelog (which btw, are two different things).

> The other option we have is to just fail miserably without messages if
> streaming DMA is not coherent, which I think the other drivers might
> do... That's all I'm trying to avoid here. I'd much prefer to have the
> dma API export this bool.

Both DMA direct and non-DMA direct streaming mappings can be either
coherent or incoherent, so your patch doesn't achieve that.  The
commit log claims the following:

"drm/vmwgfx: Improve on IOMMU detection

 instead of relying on intel_iommu_enabled, use the fact that the
 dma_map_ops::map_page != dma_direct_map_page"

which has nothing to do with the fact that streaming mappings are
coherent.  It also is incorrect as there are direct mappings that
do not use dma_direct_map_page (e.g. on ARM, or x86 with VMD).


Re: [PATCH 00/18] Shared virtual address IOMMU and VT-d support

2019-04-09 Thread Andriy Shevchenko
On Mon, Apr 08, 2019 at 04:59:15PM -0700, Jacob Pan wrote:
> Shared virtual address (SVA), a.k.a. shared virtual memory (SVM) on Intel
> platforms, allows address space sharing between device DMA and applications.
> SVA can reduce programming complexity and enhance security.
> This series is intended to enable SVA virtualization, i.e. shared guest
> application address space and physical device DMA address. Only the IOMMU
> portion of the changes is included in this series. Additional support is
> needed in VFIO and QEMU (to be submitted separately) to complete this
> functionality.
> 
> To make incremental changes and reduce the size of each patchset, this series
> does not include support for page request services.
> 
> In VT-d implementation, PASID table is per device and maintained in the host.
> Guest PASID table is shadowed in VMM where virtual IOMMU is emulated.

This seems to have missed the comments I gave you internally.

> 
> .-.  .---.
> |   vIOMMU|  | Guest process CR3, FL only|
> | |  '---'
> ./
> | PASID Entry |--- PASID cache flush -
> '-'   |
> | |   V
> | |CR3 in GPA
> '-'
> Guest
> --| Shadow |--|
>   vv  v
> Host
> .-.  .--.
> |   pIOMMU|  | Bind FL for GVA-GPA  |
> | |  '--'
> ./  |
> | PASID Entry | V (Nested xlate)
> '\.--.
> | |   |SL for GPA-HPA, default domain|
> | |   '--'
> '-'
> Where:
>  - FL = First level/stage one page tables
>  - SL = Second level/stage two page tables
> 
> 
> This work is based on collaboration with other developers on the IOMMU
> mailing list. Notably,
> 
> [1] [PATCH v6 00/22] SMMUv3 Nested Stage Setup by Eric Auger
> https://lkml.org/lkml/2019/3/17/124
> 
> [2] [RFC PATCH 2/6] drivers core: Add I/O ASID allocator by Jean-Philippe
> Brucker
> https://www.spinics.net/lists/iommu/msg30639.html
> 
> [3] [RFC PATCH 0/5] iommu: APIs for paravirtual PASID allocation by Lu Baolu
> https://lkml.org/lkml/2018/11/12/1921
> 
> There are roughly three parts:
> 1. Generic PASID allocator [1] with extension to support custom allocator
> 2. IOMMU cache invalidation passdown from guest to host
> 3. Guest PASID bind for nested translation
> 
> All generic IOMMU APIs are reused from [1], which has a v7 just published with
> no real impact on the patches used here. It is worth noting that unlike the
> SMMU nested stage setup, where the PASID table is owned by the guest, the
> VT-d PASID table is owned by the host, and individual PASIDs are bound
> instead of the whole PASID table.
> 
> 
> Jacob Pan (15):
>   ioasid: Add custom IOASID allocator
>   ioasid: Convert ioasid_idr to XArray
>   driver core: add per device iommu param
>   iommu: introduce device fault data
>   iommu: introduce device fault report API
>   iommu: Introduce attach/detach_pasid_table API
>   iommu/vt-d: Add custom allocator for IOASID
>   iommu/vt-d: Replace Intel specific PASID allocator with IOASID
>   iommu: Add guest PASID bind function
>   iommu/vt-d: Move domain helper to header
>   iommu/vt-d: Add nested translation support
>   iommu/vt-d: Add bind guest PASID support
>   iommu: add max num of cache and granu types
>   iommu/vt-d: Support flushing more translation cache types
>   iommu/vt-d: Add svm/sva invalidate function
> 
> Jean-Philippe Brucker (1):
>   drivers core: Add I/O ASID allocator
> 
> Liu, Yi L (1):
>   iommu: Introduce cache_invalidate API
> 
> Lu Baolu (1):
>   iommu/vt-d: Enlightened PASID allocation
> 
>  drivers/base/Kconfig|   7 ++
>  drivers/base/Makefile   |   1 +
>  drivers/base/ioasid.c   | 211 +
>  drivers/iommu/Kconfig   |   1 +
>  drivers/iommu/dmar.c|  48 +
>  drivers/iommu/intel-iommu.c | 219 --
>  drivers/iommu/intel-pasid.c | 191 +-
>  drivers/iommu/intel-pasid.h |  24 -
>  drivers/iommu/intel-svm.c   | 217 +++---
>  drivers/iommu/iommu.c   | 207 +++-
>  include/linux/device.h  |   3 +
>  include/linux/intel-iommu.h |  40 +--
>  include/linux/intel-svm.h   |   7 ++
>  include/linux/ioasid.h  |  66 
>  include/linux/iommu.h   | 127 +++
>  include/uapi/linux/iommu.h  | 248 
> 
>  16 files changed, 1559 insertions(+), 58 deletions(-)
>  create mode 100644 drivers/base/ioasid.c
>  create mode 100644 include/linux/ioasid.h
>  create mode 100644 include/uapi/linux/iommu.h
> 
> 

Re: [RESEND PATCH v4 0/9] mm: Use vm_map_pages() and vm_map_pages_zero() API

2019-04-09 Thread Souptick Joarder
Hi Andrew/ Michal,

On Mon, Apr 1, 2019 at 10:56 AM Souptick Joarder  wrote:
>
> Hi Andrew,
>
> On Tue, Mar 19, 2019 at 7:47 AM Souptick Joarder  wrote:
> >
> > Previously drivers had their own way of mapping a range of
> > kernel pages/memory into a user vma, and this was done by
> > invoking vm_insert_page() within a loop.
> >
> > As this pattern is common across different drivers, it can
> > be generalized by creating new functions and use it across
> > the drivers.
> >
> > vm_map_pages() is the API which could be used to map
> > kernel memory/pages in drivers which have considered vm_pgoff
> > (a usage sketch follows after the quoted cover letter below).
> >
> > vm_map_pages_zero() is the API which could be used to map a
> > range of kernel memory/pages in drivers which have not considered
> > vm_pgoff. vm_pgoff is passed as 0 by default for those drivers.
> >
> > We _could_ then at a later "fix" these drivers which are using
> > vm_map_pages_zero() to behave according to the normal vm_pgoff
> > offsetting simply by removing the _zero suffix on the function
> > name and if that causes regressions, it gives us an easy way to revert.
> >
> > Tested on Rockchip hardware and display is working fine, including talking
> > to Lima via prime.
> >
> > v1 -> v2:
> > Few Reviewed-by.
> >
> > Updated the change log in [8/9]
> >
> > In [7/9], vm_pgoff is treated in V4L2 API as a 'cookie'
> > to select a buffer, not as a in-buffer offset by design
> > and it always want to mmap a whole buffer from its beginning.
> > Added additional changes after discussing with Marek and
> > vm_map_pages() could be used instead of vm_map_pages_zero().
> >
> > v2 -> v3:
> > Corrected the documentation as per review comment.
> >
> > As suggested in v2, renaming the interfaces to -
> > *vm_insert_range() -> vm_map_pages()* and
> > *vm_insert_range_buggy() -> vm_map_pages_zero()*.
> > As the interface is renamed, modified the code accordingly,
> > updated the change logs and modified the subject lines to use the
> > new interfaces. There is no other change apart from renaming and
> > using the new interface.
> >
> > Patch[1/9] & [4/9], Tested on Rockchip hardware.
> >
> > v3 -> v4:
> > Fixed build warnings on patch [8/9] reported by kbuild test robot.
> >
> > Souptick Joarder (9):
> >   mm: Introduce new vm_map_pages() and vm_map_pages_zero() API
> >   arm: mm: dma-mapping: Convert to use vm_map_pages()
> >   drivers/firewire/core-iso.c: Convert to use vm_map_pages_zero()
> >   drm/rockchip/rockchip_drm_gem.c: Convert to use vm_map_pages()
> >   drm/xen/xen_drm_front_gem.c: Convert to use vm_map_pages()
> >   iommu/dma-iommu.c: Convert to use vm_map_pages()
> >   videobuf2/videobuf2-dma-sg.c: Convert to use vm_map_pages()
> >   xen/gntdev.c: Convert to use vm_map_pages()
> >   xen/privcmd-buf.c: Convert to use vm_map_pages_zero()
>
> Is it fine to take these patches into mm tree for regression ?

v4 of this series has not received any further comments or kbuild errors
in the last 8 weeks (including the previously posted v4).

Any suggestions on whether it is safe to take these changes through the
mm tree, or whether another tree is preferred?

>
> >
> >  arch/arm/mm/dma-mapping.c  | 22 ++
> >  drivers/firewire/core-iso.c| 15 +---
> >  drivers/gpu/drm/rockchip/rockchip_drm_gem.c| 17 +
> >  drivers/gpu/drm/xen/xen_drm_front_gem.c| 18 ++---
> >  drivers/iommu/dma-iommu.c  | 12 +---
> >  drivers/media/common/videobuf2/videobuf2-core.c|  7 ++
> >  .../media/common/videobuf2/videobuf2-dma-contig.c  |  6 --
> >  drivers/media/common/videobuf2/videobuf2-dma-sg.c  | 22 ++
> >  drivers/xen/gntdev.c   | 11 ++-
> >  drivers/xen/privcmd-buf.c  |  8 +--
> >  include/linux/mm.h |  4 ++
> >  mm/memory.c| 81 
> > ++
> >  mm/nommu.c | 14 
> >  13 files changed, 134 insertions(+), 103 deletions(-)
> >
> > --
> > 1.9.1
> >
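
As a usage sketch for the vm_map_pages() API described in the quoted cover
letter (the drv_* names and fields are invented for illustration; only
vm_map_pages() itself comes from this series):

	#include <linux/fs.h>
	#include <linux/mm.h>

	/* invented example: a driver buffer backed by a preallocated page array */
	struct drv_buffer {
		struct page **pages;
		unsigned long num_pages;
	};

	static int drv_mmap(struct file *file, struct vm_area_struct *vma)
	{
		struct drv_buffer *buf = file->private_data;

		/* maps the whole array, honouring vma->vm_pgoff as a page offset */
		return vm_map_pages(vma, buf->pages, buf->num_pages);
	}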